ABSTRACT

Systems for processing big data (e.g., Hadoop, Spark, and massively
parallel databases) need to run workloads on behalf of multiple
tenants simultaneously. The abundant disk-based storage in
these systems is usually complemented by a smaller, but much
faster, cache. Cache is a precious resource: Tenants who get to
use cache can see two orders of magnitude performance improvement.
Cache is also a limited and hence shared resource: Unlike a
resource like a CPU core which can be used by only one tenant at a
time, a cached data item can be accessed by multiple tenants at the
same time. Cache, therefore, has to be shared by a
multi-tenancy-aware policy across tenants, each having a unique set
of priorities and workload characteristics.

In this paper, we develop cache allocation strategies that speed
up the overall workload while being fair to each tenant. We build a
novel fairness model targeted at the shared resource setting that
not only incorporates the more standard concepts of Pareto-efficiency
and sharing incentive, but also defines envy-freeness via the notion
of core from cooperative game theory. Our cache management platform,
ROBUS, uses randomization over small time batches, and we
develop a proportionally fair allocation mechanism that satisfies the
core property in expectation. We show that this algorithm and related
fair algorithms can be approximated to arbitrary precision in
polynomial time. We evaluate these algorithms on a ROBUS prototype
implemented on Spark with the RDD store used as cache. Our
evaluation on an industry-standard workload shows that our algorithms
provide a speedup close to that of performance-optimal algorithms
while guaranteeing fairness across tenants.

1. INTRODUCTION 

Two recent trends in data processing are: (i) the use of multi-tenant
clusters for analyzing large and diverse datasets, and (ii)
the aggressive use of memory to speed up processing by caching
datasets. The growing popularity of systems like Apache Spark [4],
SAP HANA [24], Hadoop with Discardable Distributed Memory,
and Tachyon highlights these trends.

For example, Spark introduces an abstraction called Resilient
Distributed Dataset (RDD) to represent any data relevant to modern
analytics: files (on a local or distributed file-system), tables
(horizontally or vertically partitioned), vertices or edges of graphs,
statistical models learned from data, etc. A user can create an RDD
directly from data residing on a local or distributed file-system, or
by applying a transformation to one or more other RDDs. The user
can then direct the system to cache the RDD in memory. Figure 1
gives an example. Computations done on RDDs cached in memory
run 10-100x faster than when the data resides on disk.


    // read sales data (id, year, product, city, sales) to ...
    val sales = sc.textFile("sales.txt").map(_.split(","))
    // find sales of the current year and cache the transf...
    val salesThisYear = sales.filter(_(1) == "2015").cache()
    // find sales in city "SF" this year
    salesThisYear.filter(_(3) == "SF").map(_(4).toInt).sum
    // find sales of product "iCar" this year
    salesThisYear.filter(_(2) == "iCar").map(_(4).toInt).sum


Figure 1: A sample Spark program

Can User-Directed Caching and Multi-tenancy Coexist? User-directed
caching brings some major challenges in a multi-tenant
data analytics cluster:

• A precious resource: Tenants who get to use the in-memory
cache can see up to two orders of magnitude of performance
improvement. However, the cache is also a limited resource since the
total size of memory in a cluster is usually orders of magnitude
smaller than the data sizes stored and queried in the cluster.

• Complications from sharing: Unlike a resource like a CPU core
which is used by one tenant at a time, a cached data item can
simultaneously benefit a high-priority and a low-priority tenant.

• Avoiding cache hogs: Low-priority tenants should not be able
to hog the available cache and prevent other tenants from getting
the performance benefits they deserve.

• Utilities differ: Different tenants have different utilities for
datasets that could be placed in the cache.

When faced with such challenges, traditional cache allocation policies
can lead to user dissatisfaction, poor or unpredictable performance,
and low resource utilization. We illustrate the problems and
opportunities through an example. Consider a social networking
company, SpaceBook, that runs a multi-tenant cluster
for analyzing datasets about how its users are using the service.

Multiple tenants: The predominant practice in the industry is to
group similar users (e.g., users in the same department) into queues
(or pools). Each queue forms a tenant in the cluster. The cluster
at SpaceBook is used by three tenants: (i) Analyst, the business
analysts in the company, (ii) Engineer, the developers in the company
who develop data-driven applications such as recommendation
models, and (iii) VP, the top-level management in the company
such as the CEO and the Chief Security Officer who look at hourly
and daily reports.

Cacheable entities: These three tenants will benefit from caching
one or more of three views (R, S, and P), each of size M bytes.
Throughout this paper, "view" refers to any data item that can be
cached to give a performance benefit. For SQL workloads, a view
corresponds to a SQL expression, like any candidate view generated
by a materialized view selection algorithm [32, 58, 8, 42]. For
broader data analytics (e.g., machine learning and graph processing),
a view corresponds to a dataset on which the user has put a cache
directive (recall the example Spark program from Figure 1).


Tenant      R   S   P
Analyst     2   1   0
Engineer    2   1   0
VP          0   1   2

Table 1: Utilities of cached views to tenants at SpaceBook

Utilities: The matrix in Table 1 shows the utility that each tenant
gets if the corresponding view were to be cached in memory. A
simple definition of utility we will use in this paper is the savings
in I/O because data is read from the in-memory cache versus disk.
For example, if view R is cached in memory, then tenant Analyst
will get a utility of two units. One common pattern in multi-tenant
clusters that we bring out in Table 1 is that view R could be the
detailed logs that business analysts and developers access quite often;
view P could be a table that only the top-level management has
access to; while view S could be a materialized view with aggregated
information shared by all tenants.

Scenario 1: Debbie, the cluster DBA, is responsible for allocating
resources so that the tenants get good performance while all
resources are used effectively. Suppose the in-memory cache at
SpaceBook has a total size of M bytes. Debbie first configures a
static and equal partitioning of the cache so that each tenant is
entitled to $M/3$ bytes of cache memory. Recall that each of the views
R, S, and P is M bytes, so none of them will fit in a tenant's $M/3$
bytes of cache. Hence, resource utilization will be poor and none of
the tenants will receive any performance boost.

Scenario 2: Next, Debbie switches to a more common cache allocation
policy, Least Recently Used (LRU). The view R is used
the most at SpaceBook, so it will likely remain cached most of the
time. Thus, the Analyst and Engineer tenants will see performance
speedups. However, the VP tenant's workload will see poor performance,
causing these users, including Zuck, SpaceBook's CEO,
to complain that important reports needed for their decision-making
are not being generated on time.

Scenario 3: Debbie decides to give the VP tenant 50% higher priority
than the other tenants. So, she assigns weights to the Analyst,
Engineer, and VP tenants in the ratio 1:1:1.5. Debbie now
switches to a policy that allocates the cache based on the weighted
utility of the tenants. She tells Zuck that his reports will now be
generated faster. Unfortunately, Zuck will not see any performance
improvement: view R will still be the only one cached since it has
the highest weighted utility of 4 (= 2 x 1 + 2 x 1); higher than view
S's weighted utility of 3.5 (= 1 x 1 + 1 x 1 + 1 x 1.5), and view P's
weighted utility of 3 (= 2 x 1.5).

Scenario 4: To improve the poor performance seen by the VP
tenant, Zuck gives Debbie the money needed to double the cache
memory in the cluster. Now two views will fit in the 2M-sized
cache. However, even after this massive investment, the VP tenant
will only see a minor increase in performance compared to the
Analyst and Engineer tenants: views R and S will now be cached since
they together have the highest weighted utility of 7.5 (4 for R + 3.5
for S); higher than 7 for R and P, and 6.5 for S and P.
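
The arithmetic in Scenarios 3 and 4 can be reproduced with a short sketch. It assumes, as the scenarios do, that utilities add up across cached views; the names `weighted_utility`, `best_by_weighted_utility`, and `tenant_utilities` are illustrative and not part of ROBUS.

```python
from itertools import combinations

# Utilities from Table 1 (tenant -> view -> utility) and Debbie's weights.
UTILITY = {
    "Analyst":  {"R": 2, "S": 1, "P": 0},
    "Engineer": {"R": 2, "S": 1, "P": 0},
    "VP":       {"R": 0, "S": 1, "P": 2},
}
WEIGHT = {"Analyst": 1.0, "Engineer": 1.0, "VP": 1.5}
VIEWS = ["R", "S", "P"]


def weighted_utility(cached):
    """Sum of weight * utility over all tenants and all cached views.

    Assumption: utilities are additive across views.
    """
    return sum(WEIGHT[t] * UTILITY[t][v] for t in UTILITY for v in cached)


def best_by_weighted_utility(num_views):
    """Pick the set of `num_views` views with the highest weighted utility."""
    return max(combinations(VIEWS, num_views), key=weighted_utility)


def tenant_utilities(cached):
    """Unweighted utility that each tenant gets from the cached views."""
    return {t: sum(UTILITY[t][v] for v in cached) for t in UTILITY}


scenario3 = best_by_weighted_utility(1)  # cache of size M: one view fits
scenario4 = best_by_weighted_utility(2)  # cache of size 2M: two views fit
print(scenario3, tenant_utilities(scenario3))
print(scenario4, tenant_utilities(scenario4))
```

Under these assumptions the policy picks R alone in Scenario 3, leaving the VP tenant with utility 0, and picks R and S in Scenario 4, which gives the VP tenant utility 1 against 3 for each of the other two tenants.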

Scenario 5: Zuck is very unhappy with Debbie. She now tries to
improve things by switching back to static partitioning of $2M/3$ for
every tenant, causing everyone to get poor performance because
none of the views fit. In desperation, Debbie now has to go to the
Analyst and Engineer tenants to request them to stop adding cache
directives to their workloads. The whole multi-tenant situation
becomes extremely messy.

Better scenarios: Let us consider what Debbie would have wanted
ideally. An alternative in Scenario 3 is to cache view S instead of
R. While S has a slightly lower weighted utility of 3.5 compared to
4 for R, all three tenants will see performance improvements from
caching S. An alternative in Scenario 4 is to cache R and P, which
will also give performance benefits to all three tenants while only
being slightly lower in overall weighted utility than caching R and
S. In particular, the VP tenant will now see major benefits from
doubling the cache size.

The above example shows the non-trivial nature of the problem of
cache allocation in multi-tenant setups. There is a need to make
principled choices when it comes to picking data items to cache.
This motivates the main challenge we address.

    Develop a cache allocation policy that provides near-optimal
    performance speedups for tenants' workloads while simultaneously
    achieving near-optimal fairness in terms of the tenants'
    performance.

Our Contributions. 

• In Section 2, we propose ROBUS, a platform to optimize
multi-tenant query workloads in an online manner using cache
for speedup. This framework groups queries in small time-based
batches and employs a randomized cache allocation
policy on each batch.

• In Section 3, we consider the abstract setting of shared resource
allocation within a batch, and enumerate properties
that we desire from any allocation scheme. We show that the
notion of core from cooperative game theory captures the
fairness properties in a succinct fashion. We show that when
restricted to randomized allocation policies within a batch, a
simple algorithm termed proportional fairness generates an
allocation which satisfies fairness properties in expectation
for that batch.

• The policies we construct are based on convex programming
formulations of exponential size. Nevertheless, in Section 4,
we show that these policies admit arbitrarily good approximations
in polynomial time using the multiplicative weight
update method. We present implementations of two fair policies:
max-min fairness and proportional fairness. We also
present faster and more practical heuristics for computing
these solutions.

• We show a proof-of-concept implementation of ROBUS on a
multi-tenant Spark cluster. Motivated by practical use cases,
we develop a synthetic workload generator to create various
scenarios. Implementation details and evaluation are provided
in Section 5. Results show that our policies provide
desirable throughput and fairness in a comprehensive set of
setups.

• Finally, our policies are specified abstractly, and as such easily
extend to other resource allocation settings. We discuss
this in Section 3.4.

2. ROBUS PLATFORM 

ROBUS (Random Optimized Batch Utility Sharing), shown in
Figure 2, is the cache management platform we have developed
for multi-tenant data-parallel workloads. ROBUS is designed to be
easily pluggable into systems like Hadoop and Spark. Each tenant
submits its workload in an online fashion to a designated queue,
which is characterized by a weight indicating the tenant's fair share
of system resources. (Recall our example from Section 1.)

Figure 2: ROBUS platform


As illustrated in Figure 2, ROBUS processes the workload in
batches by running five steps in a repeated loop. Step 1 removes
from the tenants' queues the batch of queries that were submitted
during a fixed time interval. Step 2 runs an algorithm over this
entire batch to select a set of views to cache. This computation
simultaneously optimizes performance and fairness; designing this
algorithm is the main focus of this paper.

Step 2 takes three inputs: (i) a set of candidate views for the
query batch, (ii) a utility estimation model for cached views, and
(iii) the total cache budget (i.e., memory available for caching).
The candidate view generation in ROBUS is a pluggable module.
By default, the candidate views for a SQL query are the base tables
accessed by the query. For workloads like machine learning and
graph processing, the candidate views are datasets on which the
user has put a cache directive (recall the example Spark program
from Figure 1).

Any candidate view selection algorithm from the literature can
be plugged into ROBUS [32, 58, 8, 42]. To support this feature,
ROBUS has a pluggable Step 4 where a query can be rewritten
to use the views selected for caching in Step 2 before being run in
Step 5. We make use of ROBUS's pluggability in Section 5 to run a
candidate view selection algorithm that considers different vertical
projections of input tables in the workload. In the future, we plan to
extend Step 4 to support re-optimization of the query based on the
cached views. Re-optimization may change the query plan entirely.

The utility estimation model is used in the view selection procedure
to estimate the utility provided to a query by any cached
view. ROBUS currently models these utilities as savings in disk
I/O costs if the view were to be read from the in-memory cache
instead of disk. This approach keeps the models simple and widely
applicable. In the future, we plan to incorporate richer utility models
from the literature that can account for more complex candidate view
selection algorithms that consider interactions among views (e.g., use
of views can completely change the query plan for a query).
The total utility of a cache configuration to a tenant is computed by
summing up the estimated utilities of the queries submitted by the
tenant. The tenant utilities thus computed are used by the view
selection algorithm to recommend an optimal set of views to cache.

In Step 3, the cache is updated with the views selected by our
algorithm (if they are not already in the cache). The query batch
is then run via Steps 4 and 5. Every query runs as data-parallel
tasks in a system like Hadoop or Spark. Our current prototype of
ROBUS runs on Spark. A task scheduler (e.g., [2, 27]) is responsible
for allocating system resources to the tasks. Cluster memory
is divided into two parts: a heap space for run-time objects and a
cache for the selected views. While the heap is divided across tasks
and is allocated by the task scheduler, the cache is shared by all the
queries in the batch simultaneously and managed by ROBUS.
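
The five-step loop described above can be sketched as one function per batch. This is a minimal sketch under stated assumptions, not the ROBUS implementation: `run_batch`, `select_allocation`, `rewrite`, and `execute` are hypothetical names, the cache is modeled as a plain set of view names, and Step 2 is assumed to return a probability for each candidate configuration.

```python
import random


def run_batch(queues, select_allocation, cache, rewrite, execute, rng=random):
    """Process one batch; every callable here is a hypothetical plug-in.

    queues: dict tenant -> list of pending queries (emptied by Step 1)
    select_allocation: batch -> dict {frozenset of views: probability}
    cache: set of currently cached view names (updated in place)
    """
    # Step 1: remove the current batch of queries from the tenants' queues.
    batch = {tenant: list(pending) for tenant, pending in queues.items()}
    for pending in queues.values():
        pending.clear()

    # Step 2: compute a randomized allocation and sample one configuration.
    allocation = select_allocation(batch)
    configs = list(allocation)
    weights = [allocation[c] for c in configs]
    chosen = rng.choices(configs, weights=weights)[0]

    # Step 3: evict views that were not selected, load the missing ones.
    cache.intersection_update(chosen)
    cache.update(chosen)

    # Steps 4 and 5: rewrite each query against the cache, then run it.
    results = []
    for tenant, queries in batch.items():
        for query in queries:
            results.append(execute(rewrite(query, cache)))
    return chosen, results
```

Keeping Step 2 behind a single callable mirrors the pluggability described in this section: any of the fair policies from Section 3 could be supplied as `select_allocation` without changing the loop.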

3. FAIRNESS PROPERTIES AND POLICIES FOR SINGLE BATCH

In this section we study various notions of fairness when restricted
to view selection for queries from a single batch. We consider
policies that compute allocations that simultaneously provide
large utility to many tenants, and enforce a rigorous notion of
fairness between the tenants. Since this is closely related to other
resource allocation problems in economics [18, 6, 16], we draw heavily
on that work for inspiration. However, the key difference from
standard resource allocation problems is that in our setting, the
resources (or views) are simultaneously shared by tenants. In contrast,
the resource allocation settings in economics have typically
considered partitioning resources between tenants. As we shall see
below, this leads to interesting differences in the notions of fairness.

3.1 Fairness and Randomization 

It is well-known in economics that the combination of fairness
and indivisible resources (in our case, the cache and views)
necessitates randomization. To develop intuition, we present two
examples.

First consider a simple fair allocation scheme that for $N$ tenants
simply allows each tenant to use $1/N$ of the total cache for her
preferred view(s). It is plausible that some tenants prefer a large view
that does not fit in this partition but does fit in the cache. Therefore,
letting tenants have $1/N$ probability of using the whole cache
can have arbitrarily larger expected utility than the scheme which
with probability 1 lets them use a $1/N$ fraction of the whole cache.

Next, consider a batch wherein two tenants each request a different
large view such that only one can fit into the cache. In this case,
there can be no deterministic allocation scheme that does not ignore
one of the tenants. Using randomization, we can easily ensure that
each tenant has the same utility in expectation. In fact, utility in
expectation will be the per-batch guarantee we seek, which over the
long time horizon of a workload will lead to deterministic fairness.
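
As a worked instance of the first example, suppose (as an assumption for illustration) that the cache has size $C$ and tenant $i$ values only one view, worth $u > 0$ when cached, whose size lies strictly between $C/N$ and $C$. Other tenants' choices are taken to be worth nothing to tenant $i$, which is the worst case.

```latex
\[
  \mathbb{E}\bigl[U_i^{\mathrm{static}}\bigr] = 0,
  \qquad
  \mathbb{E}\bigl[U_i^{\mathrm{random}}\bigr]
    \;\ge\; \frac{1}{N}\cdot u + \Bigl(1 - \frac{1}{N}\Bigr)\cdot 0
    \;=\; \frac{u}{N} .
\]
```

The ratio between the two schemes is unbounded because the static partition yields exactly zero, which is the sense in which the randomized scheme can be arbitrarily better.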

Notation for Single Batch. 

Since our view selection policy works on individual batches at a
time, the notation and discussion below is specific to queries within
a batch. Let $N$ denote the total number of tenants. Define:

DEFINITION 1. A configuration $S$ is a set of views that is feasible,
in that the sum of the view sizes $\sum_{s \in S} |s|$ is at most the
cache size, where $|s|$ denotes the size of view $s$.

$U_i(S)$ denotes the utility to tenant $i$ that would result from caching
$S$, which is defined as the sum over all queries in $i$'s queue of the
utility for that query.

ROBUS generates a set $Q$ of configurations which by definition
can fit in the cache, and assigns a probability $x_S$ to cache each
configuration $S \in Q$. Define the vector of all such probabilities as:

DEFINITION 2. An allocation $x$ is the vector corresponding to
probabilities $x_S$ of choosing configuration $S$, normalized so that
$\|x\| = \sum_{S \in Q} x_S = 1$.

We denote $U_i(x) = \sum_{S \in Q} x_S U_i(S)$ as the expected utility of
tenant $i$ in allocation $x$. ROBUS implements allocation $x$ by sampling a
configuration from the probability distribution.

For each tenant $i$, let $U_i^* = \max_S U_i(S)$ denote the maximum
possible utility tenant $i$ can obtain if it were the only tenant in
the system. For allocation $x$, we define the scaled utility of $i$ as
$V_i(x) = U_i(x) / U_i^*$. We will use this concept crucially in defining
our fairness notions.
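
The two quantities just defined are straightforward to compute once an allocation is written as a map from configurations to probabilities. The sketch below is illustrative only: the function names are assumptions, and $U_i^*$ is taken as the maximum over the listed configurations, which presumes that the tenant's best configuration is among them.

```python
from fractions import Fraction


def expected_utility(allocation, utility):
    """U_i(x) = sum over configurations S of x_S * U_i(S), for one tenant."""
    return sum(prob * utility[config] for config, prob in allocation.items())


def scaled_utility(allocation, utility):
    """V_i(x) = U_i(x) / U_i^*.

    Assumption: U_i^* is the tenant's best utility among the configurations
    listed in `utility`.
    """
    best = max(utility.values())
    return expected_utility(allocation, utility) / best


# Each configuration holds a single view; the tenant's utilities are the
# Analyst row of Table 1, and the allocation is uniform over the three views.
analyst = {"R": 2, "S": 1, "P": 0}
uniform = {"R": Fraction(1, 3), "S": Fraction(1, 3), "P": Fraction(1, 3)}
print(expected_utility(uniform, analyst), scaled_utility(uniform, analyst))
```

Exact fractions are used so that the scaled utility can be compared directly against thresholds such as $1/N$.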

3.2 Basic Fairness Desiderata 

The first question to ask when designing a fair allocation algorithm
is what properties define fairness. There has been much recent
work in economics and computer science on heterogeneous resource
allocation problems and we begin by considering the properties
that this related work examines [27, 52, 39]. Note that because
we work within a randomized model, all of these properties
are framed in terms of expected utility of tenants.

• Pareto Efficiency (PE): An allocation is Pareto-efficient if
no other allocation improves the expected utility of at least one
tenant without decreasing the expected utility of any tenant.

• Sharing Incentive (SI): This property is termed individual
rationality in economics. For $N$ tenants, each tenant should
expect at least as much utility in the shared allocation setting as
she would expect from simply always having access to $1/N$ of the
resources. Since our allocations are randomized, allocation
$x$ satisfies SI if for all allocations $y$ with $\|y\| \le 1/N$ and for
all tenants $i$, $U_i(x) \ge U_i(y)$. In other words, $V_i(x) \ge 1/N$ for
all tenants $i$, where $V_i(x)$ is the scaled utility function defined
above.

One property that is widely studied in other resource allocation
contexts is strategy-proofness on the part of the tenants (the notion
that no tenant should benefit from lying). In our case, since the
queries are seen by the query optimizer, strategy-proofness is not an
issue. The above desiderata also omit envy-freeness (that no tenant
should prefer another tenant's allocation to her own), which is
something we revisit later.

We now consider a progression of view selection mechanisms
on a single batch from very simple to more sophisticated. As a
running example, suppose there is a cache of capacity 1. There are
three views R, S, and P that are demanded by $N$ tenants. Each view
has unit size, so that we can cache only one view at any time. Note
that this is a drastically simplified example setup only intended to
build intuition about why certain view selection algorithms might
fail or are superior to others; our results and experiments are not
restricted to unit-size views, are not limited to three tenants, and
may have arbitrarily complex utilities compared to these examples.

We can summarize the input information our view selection might
see in a given batch in a table (e.g., Table 2) where the numbers
represent utilities tenants get from the views. An allocation here is a
vector $x$ of three dimensions with $\|x\| = 1$ that gives the
probabilities $x_R, x_S, x_P$ of selecting the views in our randomized
framework.


Tenant   R   S   P
A        1   0   0
B        0   1   0
C        0   0   1

Table 2: Every tenant gets utility from a different view


Static Partitioning. 

Recall that static partitioning is the algorithm that simply
deterministically allows each of the $N$ tenants to use $1/N$ of the shared
resource. This algorithm does not take advantage of randomization.
For the example in Table 2, this algorithm cannot cache anything
because each user only gets to decide on the use of $1/3$ of the cache.
The algorithm is sharing incentive in the standard deterministic
setting, but is trivially not Pareto efficient and is not sharing
incentive in expectation either. As mentioned previously, such examples
motivate the randomization framework to start with.

Random Serial Dictatorship. 

A natural progression from static partitioning is to consider random
serial dictatorship (RSD), a mechanism that is widely considered
for problems such as house allocation and school
choice. We order the tenants in a random permutation. Each tenant
sequentially computes the best set of views to cache (in the residual
cache space) to maximize its own utility. In the example in Table 2,
each tenant gets a $1/3$ chance of picking her preferred resource (since
in a random permutation each tenant has a $1/3$ chance of appearing
first) so the allocation is $x = \langle x_R = 1/3, x_S = 1/3, x_P = 1/3 \rangle$,
where each tenant has the same utility in expectation. In fact, it is
easy to prove that RSD is always SI: Each tenant has a $1/N$ chance of
being first in the random ordering, so its scaled utility is at least $1/N$.

However, in contrast with resource partitioning problems, our
problem has a shared aspect that RSD fails to capture. For example,
consider the situation in Table 3. RSD computes the same allocation
as in the example in Table 2:
$x = \langle x_R = 1/3, x_S = 1/3, x_P = 1/3 \rangle$.
However, on this example, though RSD is SI, it is not Pareto-efficient
(PE). Tenants A and C have expected utility of 1 (a $1/3$
chance of getting 2 if they come first in the permutation and a $1/3$
chance of getting 1 if B does) and tenant B has expected utility of
$1/3$ with this allocation. However, if we used allocation
$x = \langle x_R = 0, x_S = 1, x_P = 0 \rangle$ then tenants A, B, and C all
have utility 1, which is strictly better for tenant B and as good for
tenants A and C. RSD fails to capture the fact that while each tenant
may have different top preferences, many tenants may share secondary
preferences.
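
The RSD computation on the Table 3 utilities can be checked by enumerating all tenant orderings. The sketch below is a simplified model, not the paper's implementation: it assumes unit-size views, additive utilities, and a cache that holds `capacity` views; `rsd_allocation` and `expected` are illustrative names.

```python
from fractions import Fraction
from itertools import permutations

# Utilities from Table 3.
TABLE3 = {
    "A": {"R": 2, "S": 1, "P": 0},
    "B": {"R": 0, "S": 1, "P": 0},
    "C": {"R": 0, "S": 1, "P": 2},
}
VIEWS = ["R", "S", "P"]


def rsd_allocation(utility, views, capacity=1):
    """Exact RSD allocation: average the greedy outcome over all orderings."""
    orders = list(permutations(utility))
    allocation = {}
    for order in orders:
        cached = []
        for tenant in order:
            if len(cached) == capacity:
                break  # no residual cache space left
            remaining = [v for v in views if v not in cached]
            best = max(remaining, key=lambda v: utility[tenant][v])
            if utility[tenant][best] > 0:
                cached.append(best)
        config = frozenset(cached)
        allocation[config] = allocation.get(config, 0) + Fraction(1, len(orders))
    return allocation


def expected(allocation, utility, tenant):
    """Expected utility of one tenant, assuming additive utilities."""
    return sum(prob * sum(utility[tenant][v] for v in config)
               for config, prob in allocation.items())


rsd = rsd_allocation(TABLE3, VIEWS)
always_s = {frozenset({"S"}): Fraction(1)}
for t in TABLE3:
    print(t, expected(rsd, TABLE3, t), expected(always_s, TABLE3, t))
```

Comparing the two columns of output shows the Pareto improvement described above: always caching S leaves A and C unchanged and raises B's expected utility.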


Tenant   R   S   P
A        2   1   0
B        0   1   0
C        0   1   2

Table 3: Every tenant gets utility from the same view


Utility Maximization Mechanism. 

We next consider the mechanism which simply maximizes the
total expected utility of an allocation, i.e., $\arg\max_x \sum_i U_i(x)$.
It is easy to check that this mechanism can ignore tenants who do not
contribute enough to the overall utility. In other words, it is not
SI in general.

Max-min Fairness (MMF). 

In this algorithm we combine previous insights to optimize performance
subject to fairness constraints to get a mechanism that is
both SI and PE. For allocation $x$, let
$v(x) = (V_1(x), V_2(x), \ldots, V_N(x))$
denote the vector of scaled utilities of the tenants. We choose an
allocation $x$ so that the vector $v(x)$ is lexicographically max-min fair.
This means the smallest value in $v(x)$ is as large as possible; subject
to this, the next smallest value is as large as possible, and so on.
We present algorithms to compute these allocations in Section 4.

THEOREM 1. The MMF mechanism is both PE and SI. 

PROOF. The RSD mechanism guarantees scaled utility of at least
$1/N$ to each tenant. Since the MMF allocation is lexicographically
max-min, the minimum scaled utility it obtains is at least the minimum
scaled utility in RSD, which is at least $1/N$. To show PE, note
that if there were an allocation that yielded at least as large utility
for all tenants, and strictly higher utility for one tenant, the new
allocation would be lexicographically larger, contradicting the
definition of MMF. □


Tenant   R   S
T_1      1   0
T_2      1   0
...
T_N      0   1

Table 4: All tenants except one get utility from the same view

Consider the example in Table 4. It is easy to see that the MMF
value is $1/2$ and can be achieved with an allocation of
$\langle x_R = 1/2, x_S = 1/2 \rangle$. This allocation is both SI and PE.
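
The MMF value for Table 4 can be confirmed by brute force. The sketch below is not one of the algorithms of Section 4: it is a grid search, assuming that every configuration is a single unit-size view and that probabilities are multiples of `1/steps`; all function names are illustrative.

```python
from fractions import Fraction


def grid_allocations(views, steps):
    """Yield every allocation whose probabilities are multiples of 1/steps."""
    def compositions(remaining, left):
        if len(remaining) == 1:
            yield [left]
            return
        for k in range(left + 1):
            for rest in compositions(remaining[1:], left - k):
                yield [k] + rest
    for counts in compositions(views, steps):
        yield {v: Fraction(c, steps) for v, c in zip(views, counts)}


def scaled_utilities(x, utility):
    """Vector of V_i(x) = U_i(x) / U_i^* over all tenants."""
    return [sum(x[v] * u[v] for v in x) / max(u.values())
            for u in utility.values()]


def mmf_on_grid(utility, views, steps=100):
    """Lexicographic max-min: compare sorted scaled-utility vectors."""
    return max(grid_allocations(views, steps),
               key=lambda x: sorted(scaled_utilities(x, utility)))


# Table 4 with N = 5: tenants T1..T4 want R, tenant T5 wants S.
table4 = {f"T{i}": {"R": 1, "S": 0} for i in range(1, 5)}
table4["T5"] = {"R": 0, "S": 1}
print(mmf_on_grid(table4, ["R", "S"]))
```

Sorting each scaled-utility vector in ascending order and comparing the sorted lists lexicographically is exactly the max-min order described above, since Python compares lists element by element.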

3.3 Envy-freeness and the Core 

The above discussion omits one important facet of fairness. A
fair allocation has to be envy-free, meaning no tenant has to envy
how the allocation treats another tenant. In the case where resources
are partitioned between tenants, such a notion is easy to
define: No tenant should derive higher utility from the allocation
to another tenant. However, in our setting, resources (views) are
shared between tenants, and the only common factor is the cache
space. In any allocation $x$, each tenant derives utility from certain
views, and we can term the expected size of these views as the
cache share of this user.

One could try to define envy-freeness in terms of cache space as
follows: No tenant should be able to improve expected utility by
obtaining the cache share of another tenant. But this simply means
all tenants have the same expected cache share. Such an allocation
need not be Sharing Incentive. Consider the example in Table 5,
where each view R and S has size 1 and the cache has size 1. The
only allocation that equalizes cache share caches S entirely. But
this is not SI for tenant B.


Tenant   R     S
A        0     1
B        100   1

Table 5: Counterexample for perfect Envy-freeness

This motivates taking the utility of tenants into account in defining
envy. However, this quickly gets tricky, since the utility can be a
complex function of the entire configuration, and not of individual
views. In order to develop more insight, we use an analogy to public
projects. The tenants are members of a society, who contribute
an equal amount of tax. The total tax is the cache space. Each view is
a public project whose cost is equal to its size. Users derive utility
from the subset of projects built (or views cached). In a societal
context, users are envious if they perceive an inordinate fraction of
tax dollars being spent on making a small number of users happy;
in other words, if they perceive a bridge to nowhere being built.
Let us revisit the example in Table 4. Here, the MMF allocation
sets $x = \langle x_R = 1/2, x_S = 1/2 \rangle$ and ignores the fact that an
arbitrarily large number of tenants want R, compared to just one tenant
who wants S. If we treat R as a school and S as a park, an arbitrarily
large number of users want a school compared to a park, yet half
the money is spent on the school, and half on the park. This will be
perceived as unfair on a societal level.


Randomized Core. 

In order to formalize this intuition, we borrow the notion of core
from cooperative game theory and exchange market economics [29].
We treat each user as bringing a rate endowment of $1/N$
to the system. If they were the only user in the system, we would
produce an allocation $x$ with $\|x\| = 1/N$ and maximize their utility.
An allocation $x$ over all tenants lies in the core if no subset of
tenants can deviate and obtain better utilities for all participants by
pooling together their rate endowments. More formally,

DEFINITION 3. An allocation $x$ is said to lie in the (randomized)
core if for any subset $T$ of the $N$ tenants, there is no feasible
allocation $y$ with $\|y\| = |T|/N$ for which $U_i(y) \ge U_i(x)$ for
all $i \in T$ and $U_j(y) > U_j(x)$ for at least one $j \in T$.

It is easy to check that any allocation in the core is both SI and
PE, by considering sets $T$ of size 1 and $N$ respectively. In the above
example (Table 4), the allocation
$x = \langle x_R = (N-1)/N, x_S = 1/N \rangle$ lies in
the core. Tenant $T_N$ gets its SI amount of utility and cache space.
The more demanded view R is cached by a proportionally larger
amount. In societal terms, each user perceives his tax dollars as being
spent fairly. Similarly, in the example in Table 5, the allocation
$x = \langle x_R = 1/2, x_S = 1/2 \rangle$ lies in the core.
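
Definition 3 can be explored numerically on the small examples. The sketch below searches for a blocking coalition over a grid of deviations $y$; it is an illustration under assumptions (single-view configurations, deviations restricted to multiples of `|T|/(N * steps)`), so finding no blocking coalition on the grid is evidence rather than a proof. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def grid(views, steps, total):
    """Yield vectors y over `views` with ||y|| = total on a uniform grid."""
    def compositions(remaining, left):
        if len(remaining) == 1:
            yield [left]
            return
        for k in range(left + 1):
            for rest in compositions(remaining[1:], left - k):
                yield [k] + rest
    for counts in compositions(views, steps):
        yield {v: total * Fraction(c, steps) for v, c in zip(views, counts)}


def find_blocking_coalition(x, utility, views, steps=20):
    """Return (coalition, y) violating Definition 3 on the grid, else None."""
    tenants = list(utility)
    n = len(tenants)

    def u(alloc, t):
        return sum(alloc[v] * utility[t][v] for v in views)

    for size in range(1, n + 1):
        for coalition in combinations(tenants, size):
            for y in grid(views, steps, Fraction(size, n)):
                gains = [u(y, t) - u(x, t) for t in coalition]
                if all(g >= 0 for g in gains) and any(g > 0 for g in gains):
                    return coalition, y
    return None


# Table 4 with N = 4: T1..T3 want R, T4 wants S.
table4 = {"T1": {"R": 1, "S": 0}, "T2": {"R": 1, "S": 0},
          "T3": {"R": 1, "S": 0}, "T4": {"R": 0, "S": 1}}
core_like = {"R": Fraction(3, 4), "S": Fraction(1, 4)}
mmf = {"R": Fraction(1, 2), "S": Fraction(1, 2)}
print(find_blocking_coalition(core_like, table4, ["R", "S"]))
print(find_blocking_coalition(mmf, table4, ["R", "S"]))
```

On this instance the MMF allocation is blocked by the three tenants who want R, who can pool their endowments of $1/4$ each to cache R with probability $3/4$, while no blocking coalition is found for $\langle x_R = 3/4, x_S = 1/4 \rangle$.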

In the context of provisioning public goods, there are two solution
concepts that are known to lie in the core: The first, termed a
Lindahl equilibrium [25], attempts to find per-tenant prices that
implement a Walrasian equilibrium, while the second, termed ratio
equilibrium [38], attempts to find per-tenant ratios of cache-shares.
However, these concepts are shown to exist using fixed-point theorems,
which don't lend themselves to efficient algorithmic implementations.
We sidestep this difficulty by using randomization to
our advantage, and show that a simple mechanism finds an allocation
in the core.

Proportional Fairness. 

DEFINITION 4. An allocation $x$ is proportionally fair (PF) if it
is a solution to:

\[
  \text{Maximize } \sum_{i=1}^{N} \log\bigl(U_i(x)\bigr)
  \quad \text{subject to: } \|x\| \le 1
  \tag{1}
\]


We show the following theorem using the KKT (Karush-Kuhn-Tucker)
conditions [43]. The proof also follows easily from the
classic first order optimality condition of PF [50]; however, we
present the entire proof for completeness. Subsequently, in
Section 4, we show how to compute this allocation efficiently.
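
For small instances, program (1) can be solved numerically with a few lines. The sketch below does not use the multiplicative weight update method of Section 4; it swaps in a simple multiplicative fixed-point iteration (the same update as EM for mixture weights), assumes each configuration is a single view, and assumes every tenant has positive utility for at least one view. `proportional_fair` is an illustrative name.

```python
def proportional_fair(utility, views, iterations=500):
    """Approximate the PF allocation by a multiplicative fixed-point update.

    Each step rescales x_S by (1/N) * sum_i U_i(S) / U_i(x). The update keeps
    sum_S x_S = 1 and never decreases the objective sum_i log U_i(x).
    """
    n = len(utility)
    x = {v: 1.0 / len(views) for v in views}  # uniform starting point
    for _ in range(iterations):
        expected = {t: sum(x[v] * u[v] for v in views)
                    for t, u in utility.items()}
        x = {v: x[v] * sum(u[v] / expected[t] for t, u in utility.items()) / n
             for v in views}
    return x


# Table 4 with N = 4 tenants, and Table 5.
table4 = {"T1": {"R": 1, "S": 0}, "T2": {"R": 1, "S": 0},
          "T3": {"R": 1, "S": 0}, "T4": {"R": 0, "S": 1}}
table5 = {"A": {"R": 0, "S": 1}, "B": {"R": 100, "S": 1}}
print(proportional_fair(table4, ["R", "S"]))
print(proportional_fair(table5, ["R", "S"]))
```

On Table 4 the iteration settles at $x_R = (N-1)/N$ and $x_S = 1/N$. On Table 5, setting the derivative of $\log x_S + \log(100 - 99 x_S)$ to zero gives $x_S = 100/198$, which the iteration approaches.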


THEOREM 2. Proportionally fair allocations satisfy the core 
property. 

PROOF. Let $x$ denote the optimal solution to (PF). Let $d$ denote
the dual variable for the constraint $\|x\| \le 1$. By the KKT
conditions, we have:

\[
  x_S > 0 \;\Rightarrow\; \sum_i \frac{U_i(S)}{U_i(x)} = d,
  \qquad
  x_S = 0 \;\Rightarrow\; \sum_i \frac{U_i(S)}{U_i(x)} \le d .
\]

Multiplying the first set of identities by $x_S$ and summing them, we
have

\[
  d = d \Bigl( \sum_S x_S \Bigr)
    = \sum_i \frac{\sum_S x_S U_i(S)}{U_i(x)}
    = \sum_i 1 = N .
\]

This fixes the value of $d$. Next, consider a subset $T$ of users,
with $|T| = K$, along with some allocation $y$ with $\|y\| = K/N$. First
note that the KKT conditions imply:

\[
  \sum_i \frac{U_i(S)}{U_i(x)} \le N \quad \forall S .
\]

Multiplying by $y_S$ and summing, we have:

\[
  \sum_i \frac{U_i(y)}{U_i(x)} \le N \sum_S y_S = K .
\]

Therefore, since every term of the sum is nonnegative,

\[
  \sum_{i \in T} \frac{U_i(y)}{U_i(x)} \le K .
\]

Therefore, if $U_i(y) > U_i(x)$ for some $i \in T$, then there exists
$j \in T$ for which $U_j(y) < U_j(x)$. This shows that no subset $T$ can
deviate to improve their utility, so that the (PF) allocation lies in the
core. □

In fact, the ratio of the utilities of MMF and PF is precisely
the Jain's index [37] of the vector $(N_1, N_2, \ldots, N_k)$. By setting
$k = N/2 + 1$, and $N_2 = N_3 = \cdots = N_k = 1$, this shows that (PF)
can have $\Omega(N)$ times larger total utility than MMF. Our next
scenario focuses on arbitrary instances with only two tenants.

LEMMA 2. For two tenants, the total utility of (PF) is at least
the total utility of MMF.

PROOF. Let the utilities of the two tenants be $a, b$ in (PF) and
$A, B$ in MMF. Assume $a \le b$. Since MMF maximizes the minimum
utility, we have $a \le \min(A, B)$. Let $\alpha = A/a$ and $\beta = B/b$,
so that $\alpha \ge 1$. Since $\log(a) + \log(b) = \log(ab)$ is maximized
by definition of PF and log is an increasing function, we have
$ab \ge AB$, so $\alpha\beta \le 1$. Since $\alpha \ge 1$, this implies
$1/\beta \ge \alpha \ge 1$. Therefore

\[
  b - B = B(1/\beta - 1) \ge a(1/\beta - 1) \ge a(\alpha - 1) = A - a .
\]

This shows $a + b \ge A + B$, completing the proof. □


3.4 Discussion 

Our notion of core easily extends to tenants having weights. Sup¬ 
pose tenant i has weight A;. Then an allocation x belongs to the 
core if for all subsets T of tenants, there does not exist y with 

||y|| = such that for all tenants i £ T, 17/(x) < Uj (y), and 

Li A; 

Uj(x) < Uj( y) for at least one j £ T. The proportional fairness al¬ 
gorithm is modified to maximize £,■ A/log 17/(x) subject to ||x|| < 1. 

We note that the PF algorithm finds an allocation in the (ran¬ 
domized) core to any resource allocation game that can be speci¬ 
fied as follows: The goal is to choose a randomization over feasible 
configurations of resources. Each configuration yields an arbitrary 
utility to each tenant. This model is fairly general. For instance, 
consider the setting in (27j[52), where resources can be partitioned 
fractionally between agents, and an agent’s rate (utility) depends 
on the minimum resource requirement satisfied in any dimension. 
Suppose we treat each agent as being endowed with ^ fraction of 
the supply of resources in all dimensions, the above result shows 
that the (PF) allocation satisfies the property that no subset of users 
can pool their endowments together to achieve higher rates for all 
participants. 

Utilities under MMF and PF. 

We now compare the total utility, V/(x) for the optimal MMF 
and (PF) solutions. We present results showing that (PF) has larger 
utility than MMF in certain canonical scenarios. Our first sce¬ 
nario defines the following grouped instance: There are k views, 
1,2,... ,k each of unit size. The cache also has size 1. There are k 
groups of tenants; group i has Nj tenants all of which want view i. 

Lemma 1. The total utility of (PF) is at least the total utility of 
MMF for any grouped instance. 

PROOF. On grouped instances, MMF sets rate l/k for each ten¬ 
ant, yielding a total utility of N/k for N tenants. The (PF) algorithm 
sets rate x/ = Nj/N for all tenants in group i. This yields total utility 
of £,-iV? /N. Next note that 

Noting that £,-iV/ = N, it is now easy to verify that (PF) yields larger 
utility. □ 
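
The grouped-instance comparison can be checked numerically. The sketch below is an illustration under stated assumptions: the group sizes are made up, and the three helper functions simply encode the closed forms from the proof of Lemma 1 (total utility N/k for MMF, sum of N_i^2 / N for PF) together with the standard definition of Jain's index.

```python
def total_utility_mmf(group_sizes):
    """MMF caches each of the k views with probability 1/k."""
    n, k = sum(group_sizes), len(group_sizes)
    return n / k


def total_utility_pf(group_sizes):
    """PF caches view i with probability N_i / N."""
    n = sum(group_sizes)
    return sum(size * size for size in group_sizes) / n


def jain_index(values):
    """Jain's index: (sum v)^2 / (k * sum v^2), a value in (0, 1]."""
    return sum(values) ** 2 / (len(values) * sum(v * v for v in values))


groups = [4, 1, 1, 1]  # hypothetical group sizes N_1, ..., N_k
mmf, pf = total_utility_mmf(groups), total_utility_pf(groups)
print(mmf, pf, mmf / pf, jain_index(groups))
```

The last two printed values coincide, which is the Jain's index relationship; with equal group sizes the two policies give the same total utility.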


Summary of Fairness Properties. 

In summary, Table 6 shows the fairness properties that hold for
each of our candidate algorithms. We abbreviate the properties SI for
sharing incentive and PE for Pareto efficiency. Based on this
analysis, we suggest that proportional fairness is likely to be a
preferable view selection algorithm for our ROBUS framework: it is the
only candidate that satisfies all three properties, so it should
perform both fairly and efficiently.


Algorithm                    SI   PE   CORE
Random Serial Dictatorship   ✓    -    -
Utility Maximization         -    ✓    -
Max-Min Fairness             ✓    ✓    -
Proportional Fairness        ✓    ✓    ✓

Table 6: Fairness properties of mechanisms


4. APPROXIMATELY COMPUTING PF AND 
MMF ALLOCATIONS 

In this section, we show that the PF and MMF allocations can 
be computed to arbitrary precision. We then present fast heuristic 
algorithms for approximately computing PF and MMF allocations, 
which we implement in our prototype. 

One key issue in computation is that the number of configurations
is exponential in the number of views and tenants, so that the
convex programming formulations have exponentially many variables.
Nevertheless, since the programs have O(N) constraints, we
use the multiplicative weight method [?] to solve them approximately
in time polynomial in N and the accuracy parameter $1/\epsilon$.
These algorithms assume access to a welfare maximization subroutine
that we term WELFARE.

Definition 5. Given weight vector w, WELFARE(w) computes
a configuration S that maximizes weighted scaled utilities, i.e., solves
$\arg\max_S \sum_{i=1}^{N} w_i V_i(S)$.

The scaled utilities are computed using the tenant utility model
described in Section 2. In our presentation, we assume WELFARE
solves the welfare maximization problem exactly. Our algorithms
will make polynomially many calls to WELFARE.
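
To make the role of WELFARE concrete, here is a minimal brute-force sketch. It assumes a small number of views so that every configuration fitting in the cache can be enumerated, and it takes the scaled utility $V_i(S)$ as a caller-supplied function; `feasible_configurations`, `welfare`, and the toy instance are hypothetical names for illustration, not the prototype's implementation.

```python
from itertools import combinations


def feasible_configurations(view_sizes, capacity):
    """Enumerate every set of views whose total size fits in the cache."""
    names = sorted(view_sizes)
    configs = []
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            if sum(view_sizes[v] for v in combo) <= capacity:
                configs.append(frozenset(combo))
    return configs


def welfare(weights, scaled_utility, configurations):
    """Return the configuration S maximizing sum_i w_i * V_i(S)."""
    def weighted_utility(config):
        return sum(w * scaled_utility(i, config) for i, w in enumerate(weights))
    return max(configurations, key=weighted_utility)


# Hypothetical toy instance: two unit-size views, a cache of size 1,
# tenants 0 and 1 want view A and tenant 2 wants view B.
VIEW_SIZES = {"A": 1.0, "B": 1.0}
WANTED = ["A", "A", "B"]


def toy_utility(tenant, config):
    """Scaled utility is 1 if the tenant's view is cached, else 0."""
    return 1.0 if WANTED[tenant] in config else 0.0


CONFIGS = feasible_configurations(VIEW_SIZES, capacity=1.0)
print(welfare([1.0, 1.0, 1.0], toy_utility, CONFIGS))
```

Enumeration is exponential in the number of views, so this only serves to pin down what the subroutine returns; the weights passed in are what the later algorithms adjust between calls.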


















Multiplicative Weight Method.

We first detail the multiplicative weight method, which will serve
as a common subroutine to all our provably good algorithms. This
classical framework [?] uses a Lagrangian update to decide
feasibility of linear constraints to arbitrary precision.

We first define the generic problem of deciding the feasibility of
a set of linear constraints: Given a convex set $P \subseteq R^s$, and
an $r \times s$ matrix A,

$$LP(A, b, P): \; \exists x \in P \text{ such that } Ax \ge b?$$

Let $y \ge 0$ be an r-dimensional dual vector for the constraints
$Ax \ge b$. We assume the existence of an efficient ORACLE of the
form:

$$\text{ORACLE: } C(A, y) = \max \{ y^T A z : z \in P \}$$

The ORACLE can be interpreted as follows: Suppose we take a
linear combination of the rows of Ax, multiplying row $a_i x$ by $y_i$.
Suppose we maximize this as a function of $x \in P$, and it turns out
to be smaller than $y^T b$. Then, there is no feasible way to satisfy
all constraints in $Ax \ge b$, since a feasible solution x would make
$y^T A x \ge y^T b$. On the other hand, suppose we find a feasible x.
Then, we check which constraints are violated by this x, and increase
the dual multipliers $y_i$ for these constraints. If, instead, a
constraint is too slack, we decrease its dual multiplier. We iterate
this process until either we find a y which proves $Ax \ge b$ is
infeasible, or the process roughly converges.

More formally, we present the Arora-Hazan-Kale (AHK) procedure [?]
for deciding the feasibility of LP(A, b, P). The running
time is quantified in terms of the WIDTH defined as:

$$\rho = \max_i \max_{x \in P} | a_i x - b_i |$$

Algorithm 1 AHK Algorithm
 1: Let $K \leftarrow 4 \rho^2 \log r / \delta^2$; $y_1 \leftarrow \mathbf{1}$
 2: for t = 1 to K do
 3:   Find $x_t$ using ORACLE $C(A, y_t)$.
 4:   if $C(A, y_t) < y_t^T b$ then
 5:     Declare LP(A, b, P) infeasible and terminate.
 6:   end if
 7:   for i = 1 to r do
 8:     $M_{it} = a_i x_t - b_i$                ▷ Slack in constraint i.
 9:     $y_{i,t+1} \leftarrow y_{it} (1 - \delta)^{M_{it}/\rho}$ if $M_{it} \ge 0$.
10:     $y_{i,t+1} \leftarrow y_{it} (1 + \delta)^{-M_{it}/\rho}$ if $M_{it} < 0$.
11:                                             ▷ Multiplicatively update y.
12:   end for
13:   Normalize $y_{t+1}$ so that $\|y_{t+1}\| = 1$.
14: end for
15: Return $x = \frac{1}{K} \sum_{t=1}^{K} x_t$.

This procedure has the following guarantee [?]:

THEOREM 3. If LP(A, b, P) is feasible, the AHK procedure never
declares infeasibility, and the final x satisfies:

$$(a_i x - b_i) + \delta \ge 0 \quad \forall i$$

4.1 Proportional Fairness

Our algorithm uses the AHK algorithm as a subroutine and considers
dual weights to find an additive $\epsilon$ approximation solution.
The primary result is the following theorem:

THEOREM 4. An approximation algorithm computes an additive
$\epsilon$ approximation to (PF) with a number of calls to WELFARE
that is polynomial in N and $1/\epsilon$, and polynomial additional
running time.

PROOF. For allocation x, let $B(x) = \sum_i \log V_i(x)$. Let
$Q^* = \max_x B(x)$ denote the optimal value of (PF), and let $x^*$
denote an allocation attaining it. We first present a Lipschitz type
condition, whose proof we omit from this version.

LEMMA 3. Let y satisfy $B(y) \ge Q^* - \epsilon$ for
$\epsilon \in (0, 1/6)$. Then, for all i, $V_i(y) \ge V_i(x^*)/2$.

The proof idea is to use the concavity of the log function to
exhibit a convex combination of $x^*$ and y whose value exceeds $Q^*$,
which is a contradiction. It is therefore sufficient to find $Q^*$ to an
additive approximation in order to achieve at least half the welfare
of (PF) for all tenants. Towards this end, for a parameter Q, we
write (PF) as a feasibility problem PFFEAS(Q) as follows:

DEFINITION 6. PFFEAS(Q) decides the feasibility of the
constraints

$$(F) = \Big\{ \sum_S x_S V_i(S) - \gamma_i \ge 0 \quad \forall i \Big\}$$

subject to the constraints:

$$(P1) = \Big\{ \sum_S x_S \le 1, \; x_S \ge 0 \;\; \forall S \Big\}$$

$$(P2) = \Big\{ \sum_i \log \gamma_i \ge Q, \; \gamma_i \in [1/N, 1] \;\; \forall i \Big\}$$

The above formulation is not an obvious one, and is related to
virtual welfare approaches recently proposed in Bayesian mechanism
design [?]. The key idea is to connect expected values (utility) to
their realizations in each configuration via expected value
variables, the $\gamma_i$. The constraints (P2) and (P1) are over
expected values and realizations respectively. The ORACLE
computation in the multiplicative weight procedure will decouple into
optimizing expected value variables over (P2), and optimizing
WELFARE over (P1) respectively, and both these problems will be easily
solvable.

We note that (P2) has additional constraints
$\gamma_i \in [1/N, 1] \; \forall i$. These are in order to reduce the
width of the constraints (F). Note that otherwise, $\gamma_i$ can take
on unbounded values while still being feasible to (P2), and this makes
the width of (F) unbounded. The lower bound of 1/N on $\gamma_i$ is to
control the approximation error introduced. We argue below that these
constraints do not change our problem.

LEMMA 4. Let $Q^*$ denote the optimal value of the proportional
fair allocation (PF). Then, PFFEAS(Q) is feasible if and only if
$Q \le Q^*$.

PROOF. In the formulation PFFEAS(Q), the quantity $\gamma_i$ is
simply the scaled utility of tenant i. Consider the proportionally fair
allocation $x^*$. For this allocation, all scaled utilities lie in
[1/N, 1] since the allocation is SI. Therefore, $x^*$ is feasible for
PFFEAS($Q^*$). On the other hand, if y is feasible to PFFEAS(Q) for
$Q > Q^*$, then y is also feasible for (PF) with a larger objective
value, contradicting the optimality of $x^*$. □










We will therefore search for the largest Q for which PFFEAS(Q)
is feasible. Since each $\gamma_i \in [1/N, 1]$, we have
$Q \in [-N \log N, 0]$. Therefore, obtaining an additive $\epsilon$
approximation to $Q^*$ by binary search requires $O(\log N)$
evaluations of PFFEAS(Q) for various Q, assuming constant
$\epsilon > 0$.

Solving PFFEAS(Q). We now fix a value Q and apply the AHK
procedure to decide the feasibility of PFFEAS(Q). To map to the
description in the AHK procedure, we have b = 0, and A is the
LHS of the constraints (F). We have r = N. Since any $V_i(S) \le 1$,
and $\gamma_i \in [1/N, 1]$, the width $\rho$ of (F) is at most 1.
Finally, for small constant $\epsilon > 0$, we will set
$\delta = \epsilon/N^2$. Therefore, $K = 4 N^4 \log N / \epsilon^2$.

For dual weights w, the oracle subproblem C(A, w) is the following:

$$\max_{x, \gamma} \sum_i w_i \big( V_i(x) - \gamma_i \big)$$

subject to (P1) and (P2). This separates into two optimization
problems.

The first sub-problem maximizes $\sum_i w_i V_i(x)$ subject to x
satisfying (P1). This is simply WELFARE(w). The second sub-problem is
the following:

$$\text{Minimize } \sum_i w_i \gamma_i$$

subject to $\gamma$ satisfying (P2). Let L denote the dual multiplier
to the constraint $\sum_i \log \gamma_i \ge Q$. Consider the Lagrangian
problem:

$$\text{Minimize } \sum_i \big( w_i \gamma_i - L \log \gamma_i \big)$$

subject to $\gamma_i \in [1/N, 1]$ for all i. The optimal solution sets
$\gamma_i(L) = \max(1/N, \min(1, L/w_i))$, which is a non-decreasing
function of L. We check if $\sum_i \log \gamma_i(L) < Q$. If so, we
increase L till we satisfy the constraint with equality. This
parametric search takes polynomial time, and solves the second
sub-problem.
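
The parametric search over L can be sketched with a bisection, since $\sum_i \log \gamma_i(L)$ is continuous and non-decreasing in L. This is an illustrative sketch, not the paper's code: it assumes strictly positive weights and a target Q in [-N log N, 0], and the function names are made up for the example.

```python
import math


def gamma_for_multiplier(weights, multiplier):
    """gamma_i(L) = max(1/N, min(1, L / w_i)) for strictly positive w_i."""
    n = len(weights)
    return [max(1.0 / n, min(1.0, multiplier / w)) for w in weights]


def solve_gamma_subproblem(weights, q, iterations=200):
    """Minimize sum_i w_i*gamma_i s.t. sum_i log(gamma_i) >= q, gamma_i in [1/N, 1].

    Assumes every weight is > 0 and -N*log(N) <= q <= 0, so a solution exists.
    """
    def log_sum(multiplier):
        return sum(math.log(g) for g in gamma_for_multiplier(weights, multiplier))

    low, high = 0.0, max(weights)  # at L = max(w), every gamma_i equals 1
    if log_sum(low) >= q:
        return gamma_for_multiplier(weights, low)
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if log_sum(mid) >= q:
            high = mid  # constraint holds: try a smaller multiplier
        else:
            low = mid   # constraint violated: increase the multiplier
    return gamma_for_multiplier(weights, high)


gammas = solve_gamma_subproblem([0.5, 0.3, 0.2], q=-1.0)
print(gammas, sum(math.log(g) for g in gammas))
```

The bisection keeps `high` feasible throughout, so the returned vector always satisfies the log-sum constraint and meets it with equality up to floating-point precision.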

The AHK procedure now gives the following guarantee: Either
we declare PFFEAS(Q) is infeasible, or we find $(x, \gamma)$ such that
for all i, we have:

$$\sum_S x_S V_i(S) \ge \gamma_i - \epsilon/N^2 \ge \gamma_i (1 - \epsilon/N)$$

Since $\sum_i \log \gamma_i \ge Q$, the above implies:

$$B(x) = \sum_i \log V_i(x) \ge Q + \sum_i \log(1 - \epsilon/N) \ge Q - \epsilon$$

so that the value $Q - \epsilon$ is achievable with the allocation x.

Binary Search. To complete the analysis, since PFFEAS($Q^*$) is
feasible, the procedure will never declare infeasibility when run
with $Q = Q^*$, and will find an x with $B(x) \ge Q^* - \epsilon$,
yielding an additive $\epsilon$ approximation. This binary search over
Q takes $O(\log N)$ iterations.

Thus, we arrive at the result of Theorem 4.

4.2 Max-min Fairness 

We present an algorithm SimpleMMF that computes an allocation
x maximizing $\min_i V_i(x)$. The MMF allocation can be computed by
applying this procedure iteratively as in [28]; we omit the
simple details from this version. We note that the idea of applying
the multiplicative weight method to compute max-min utility also
appeared in [?].


We write the problem of deciding feasibility as SimpleMMF($\lambda$):

$$(F) = \Big\{ \sum_S V_i(S)\, x_S \ge \lambda \quad \forall i \Big\}$$

subject to the constraints:

$$(P) = \Big\{ \sum_S x_S \le 1, \; x_S \ge 0 \;\; \forall S \Big\}$$

We have $\lambda^* \in [1/N, 1]$, where
$\lambda^* = \max_x \min_i V_i(x)$. Therefore, the width $\rho \le 1$.
Further, we can set $\delta = \epsilon/N$. We can now compute K from
the AHK procedure, so that $K = 4 N^2 \log N / \epsilon^2$ in order to
approximate $\lambda^*$ to a factor of $(1 - \epsilon)$. The procedure
is described in Algorithm 2.


Algorithm 2 Approximation Algorithm for SimpleMMF
 1: Let $\epsilon$ denote a small constant < 1.
 2: $T \leftarrow 4 N^2 \log N / \epsilon^2$
 3: $w_1 \leftarrow 1/N$                       ▷ Initial weights
 4: $x \leftarrow 0$                           ▷ Probability distribution over set of views
 5: for $k \in 1, 2, \ldots, T$ do
 6:   Let S be the solution to WELFARE($w_k$).
 7:   $w_{i(k+1)} \leftarrow w_{ik} \exp(-\epsilon V_i(S)/N)$
 8:   Normalize $w_{k+1}$ so that $\|w_{k+1}\| = 1$.
 9:   $x_S \leftarrow x_S + 1/T$               ▷ Add S to collection
10: end for


In order to compute MMF allocations, we use a similar idea to
decide feasibility, except that we have to perform $O(N^2)$
invocations. This blows up the running time to
$O(N^4 \log N / \epsilon^2)$ invocations of WELFARE.

The algorithm gives the following result:

THEOREM 5. An approximation algorithm for SimpleMMF
(Algorithm 2) finds a solution x such that
$\min_i V_i(x) \ge \lambda^*(1 - \epsilon)$ using
$O(N^2 \log N / \epsilon^2)$ calls to WELFARE.
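
The multiplicative weight loop for SimpleMMF can be sketched in a few lines. The sketch makes several assumptions that are not taken from the prototype: configurations are given explicitly as a small dictionary, WELFARE is a brute-force maximum over that dictionary, the weight update is the exponential form exp(-epsilon * V_i(S) / N), and there are at least two tenants (so that log N > 0).

```python
import math


def simple_mmf(config_utilities, epsilon=0.1):
    """Approximate max-min allocation over an explicit set of configurations.

    config_utilities maps a configuration name to [V_0(S), ..., V_{N-1}(S)].
    Assumes N >= 2 tenants and scaled utilities in [0, 1].
    """
    n = len(next(iter(config_utilities.values())))
    rounds = math.ceil(4 * n * n * math.log(n) / epsilon ** 2)
    weights = [1.0 / n] * n
    x = {name: 0.0 for name in config_utilities}
    for _ in range(rounds):
        # WELFARE(w): brute force over the (small) set of configurations.
        best = max(
            config_utilities,
            key=lambda s: sum(w * v for w, v in zip(weights, config_utilities[s])),
        )
        # Tenants served well by `best` lose weight; neglected tenants gain
        # relative influence in the next round.
        weights = [w * math.exp(-epsilon * v / n)
                   for w, v in zip(weights, config_utilities[best])]
        total = sum(weights)
        weights = [w / total for w in weights]
        x[best] += 1.0 / rounds
    return x


# Hypothetical instance: tenants 0 and 1 want view A, tenant 2 wants view B.
configs = {"A": [1.0, 1.0, 0.0], "B": [0.0, 0.0, 1.0]}
allocation = simple_mmf(configs)
utilities = [sum(allocation[s] * configs[s][i] for s in configs) for i in range(3)]
print(allocation, min(utilities))
```

On this instance the max-min value is 1/2 (each view cached half the time), and the returned allocation comes within the (1 - epsilon) factor of it.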

4.3 Fast Heuristics 

In this section, we present heuristic algorithms that directly work 
with the exponential size convex programs. We directly implement 
these algorithms in software to gather our experimental results. 

Configuration Pruning. 

For $M = O(N^2)$, generate M random N-dimensional unit vectors
$w_k$, $k = 1, 2, \ldots, M$. For each $w_k$, let $S_k$ be the
configuration corresponding to WELFARE($w_k$). Denote this set of
configurations by $\mathcal{S}$. We restrict the convex programming
formulations of PF and MMF to just the set of configurations
$\mathcal{S}$, and solve these programs directly, as we describe below.
The intuition behind doing this pruning step is the following: The
approximation algorithms for PF and MMF find convex combinations of
configurations that are optimal for WELFARE(w) for some w's that are
computed by the multiplicative weight procedure. Instead of this, we
generate random such Pareto-optimal configurations, giving sufficient
coverage so that each tenant has a high probability of having the
maximum weight at least once.
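
A minimal sketch of the pruning step follows. It assumes non-negative random weights normalized to unit Euclidean length (the source does not say which norm or sampling distribution is used) and an explicit dictionary of candidate configurations; all names are illustrative.

```python
import random


def random_unit_vector(n, rng):
    """Random non-negative direction, normalized to unit Euclidean length."""
    raw = [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]
    norm = sum(v * v for v in raw) ** 0.5
    return [v / norm for v in raw]


def prune_configurations(config_utilities, num_vectors, seed=0):
    """Keep only configurations that win WELFARE(w) for some sampled w."""
    rng = random.Random(seed)
    n = len(next(iter(config_utilities.values())))
    kept = set()
    for _ in range(num_vectors):
        w = random_unit_vector(n, rng)
        best = max(
            config_utilities,
            key=lambda s: sum(wi * vi for wi, vi in zip(w, config_utilities[s])),
        )
        kept.add(best)
    return kept


# Hypothetical instance: three tenants, each wanting a different view,
# plus an empty configuration that should never be selected.
configs = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0],
           "C": [0.0, 0.0, 1.0], "none": [0.0, 0.0, 0.0]}
pruned = prune_configurations(configs, num_vectors=9)  # M = N^2 samples
print(sorted(pruned))
```

Dominated configurations such as the empty cache are never returned, which is what shrinks the convex programs to a manageable size.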

We compared two algorithms for SimpleMMF, one using the
multiplicative weight procedure (Algorithm 2), and the other solving
the linear program (Program (3) below) restricted to random
optimal configurations. When run on 200 batches with five tenants,
using 5 weight vectors gives a 10.4% approximation to the
objective of SimpleMMF. With 25 random weight vectors, the
approximation error is 1.4%, and using 50 random weights, the
approximation error drops to 0.6%. This shows that a small set
$\mathcal{S}$ of configurations that are optimal solutions to
WELFARE(w) for random vectors w is sufficient to generate good
approximations to our convex programs. In our implementation, we set
$\mathcal{S}$ to be the union of these configurations along with the
configurations generated by the SimpleMMF algorithm (Algorithm 2).

Proportional Fairness. 

We first note that (PF) is equivalent to the following; the proof
of equivalence follows from Theorem 2, where the dual variable
corresponding to the constraint $\sum_S x_S = 1$ is precisely N.

$$\text{Max } g(x) = \sum_{i=1}^{N} \log(V_i(x)) - N \|x\| \quad \text{s.t.: } x \ge 0 \qquad (2)$$

Given a configuration space $\mathcal{S}$, we can solve the program (2)
using projected gradient ascent, as shown in Algorithm 3. As
precomputation, for each configuration $S \in \mathcal{S}$, we
precompute $V_i(S)$. Then
$V_i(x) = \sum_{S \in \mathcal{S}} V_i(S)\, x_S$.


Algorithm 3 Proportional Fairness Heuristic
 1: Let $M = |\mathcal{S}|$. Set t = 1.
 2: Let $x_1 = (1/M, 1/M, \ldots, 1/M)$.
 3: repeat
 4:   $y = \nabla g(x)$ evaluated at $x = x_t$.
 5:   $r^* = \arg\max_r \, g(x_t + r y)$
 6:   $x_{t+1} = x_t + r^* y$
 7:   Project $x_{t+1}$ as: $x_{t+1,d} = \max(x_{t+1,d}, 0)$ for all
      dimensions $d \in \{1, 2, \ldots, M\}$.
 8: until $x_t$ converges
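
The heuristic can be sketched in plain Python as follows. Two simplifications are assumptions of this sketch rather than details from the paper: the exact line search is replaced by picking the best step from a geometric grid, and the objective is evaluated after projection so that every accepted iterate keeps all utilities positive. It also assumes each tenant has positive utility in at least one configuration.

```python
import math


def pf_objective(x, utilities):
    """g(x) = sum_i log V_i(x) - N * sum_S x_S; -inf if some V_i(x) <= 0."""
    n = len(utilities[0])
    total = -n * sum(x)
    for i in range(n):
        v = sum(xs * row[i] for xs, row in zip(x, utilities))
        if v <= 0.0:
            return -math.inf
        total += math.log(v)
    return total


def pf_gradient(x, utilities):
    """Partial derivative w.r.t. x_S: sum_i V_i(S) / V_i(x) - N."""
    n = len(utilities[0])
    v = [sum(xs * row[i] for xs, row in zip(x, utilities)) for i in range(n)]
    return [sum(row[i] / v[i] for i in range(n)) - n for row in utilities]


def fast_pf(utilities, max_iters=500, tol=1e-9):
    """utilities[s][i] holds V_i(S) for configuration s and tenant i."""
    m = len(utilities)
    x = [1.0 / m] * m
    steps = [2.0 ** -k for k in range(40)]  # candidate step sizes
    for _ in range(max_iters):
        grad = pf_gradient(x, utilities)

        def moved(r):
            # Gradient step followed by projection onto x >= 0.
            return [max(xs + r * g, 0.0) for xs, g in zip(x, grad)]

        best = max(steps, key=lambda r: pf_objective(moved(r), utilities))
        new_x = moved(best)
        if pf_objective(new_x, utilities) <= pf_objective(x, utilities):
            break  # no candidate step improves the objective
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x


# Hypothetical instance: tenants 0 and 1 want view A, tenant 2 wants view B.
solution = fast_pf([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(solution, sum(solution))
```

On this instance the iterates approach (2/3, 1/3); the entries sum to 1 even though program (2) has no explicit normalization constraint, which matches the role of N as the dual variable.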


Max-min Fairness. 

Using the precomputed configuration space $\mathcal{S}$, we solve
SimpleMMF using the following linear program:

$$\max \Big\{ \lambda \;\Big|\; \sum_{S \in \mathcal{S}} V_i(S)\, x_S \ge \lambda \;\; \forall i, \;\; \|x\| \le 1, \;\; x \ge 0 \Big\} \qquad (3)$$

This can be solved using any off-the-shelf LP solver (our
implementation uses the open source lpsolve package [?]). In order to
compute the MMF allocation, we iteratively compute the
lexicographically max-min allocation using the above LP. The details
are standard; see for instance [28]. Briefly, in each iteration a value
of $\lambda$ is computed. All tenants whose rate cannot be increased
beyond $\lambda$ without decreasing the rate of another tenant are
considered saturated, and the rate of $\lambda$ for these tenants is a
constraint in the next iteration of the LP. The solution to the final
LP, for which all tenants are saturated, is the MMF solution.

5. EVALUATION 

We evaluate cache allocation policies on a variety of practical se¬ 
tups of multi-tenant analytics clusters. The setups may differ in the 
number of tenants, workload arrival patterns, data access patterns, 
etc. Some of the example setups are listed below. 

• Analysts: Tenants correspond to various BI analysts in an en¬ 
terprise that run a similar workload. Some of the datasets are 
frequently accessed by all tenants suggesting a good opportu¬ 
nity for shared optimization. 


• ETL+Analysts: All analysts have similar data access patterns 
as above. But additionally, a tenant runs ETL workload that 
may touch different datasets than the BI tenants. 

• Production+Engineering: Engineering workload is of bursty 
nature. Depending on the time of day, engineering queues have 
different amounts of work whereas production queues, running 
pre-scheduled workflows, have similar amounts of work. 

We replicate various combinations of these setups on a small- 
scale Spark cluster and run controlled experiments using a mix 
of TPC-H benchmark [?] workload and a synthetically generated 
scan-based workload. 

5.1 Setup and Methodology 

Figure 2 presents the architecture of ROBUS. We use Apache
Spark [4] to build a system prototype. Spark is a natural choice for
the evaluation since it supports a distributed memory-based
abstraction in the form of Resilient Distributed Datasets (RDDs). In
our prototype, a long-running Spark context is shared among multiple
queues, with each queue corresponding to a tenant. The Spark context
has access to the entire RDD cache in the cluster. Spark's
internal fair share scheduler is configured with a dedicated pool for
each queue; the fair share properties of the pool are set proportional
to the weight of the corresponding queue.


Spark version               1.1.1
Number of worker nodes      10
Instance type of nodes      c3.2xlarge
Total number of cores       80
Executor memory             80GB

Table 7: Test cluster setup on Amazon EC2


Table 7 presents our test cluster setup. We generate two types
of data to reflect two types of uses observed in typical multi-tenant
clusters: (a) a set of 30 datasets of varying sizes, each matching
the schema of one of the “sales” tables (store_sales, catalog_sales,
and web_sales) from the TPC-DS benchmark [?], and (b) all TPC-H
benchmark [?] datasets generated at scale 5.

The first category of data represents raw fact/log data that comes 
into the cluster from the OLTP/operational databases in a com¬ 
pany. This data is processed by synthetically-generated ETL and 
exploratory SQL queries, each performing scans and aggregations 
over a dataset. We refer to this category of queries as the Sales 
workload. Total size of Sales data on disk is 600GB. We create 
a vertical projection view on each dataset on its most frequently 
accessed columns and use it whenever possible to answer queries. 
Sizes of these views when loaded to cache range from 118MB to
3.6GB, as can be seen in Figure 3.

The second category represents data in the cluster after it has
been processed by ETL. Note that this data is typically much smaller
in size compared to fact/log data. In our experiments, this data is
queried by standard TPC-H benchmark queries, a suite of
business-oriented analytics queries that involve more complex
operations, such as joins, than the Sales workload. All of
the queries in our evaluation are submitted using SparkSQL APIs.

We set the cache size to 8 GB, 10% of the total executor memory, 
leaving aside the rest as a heap space. Only 6 GB of the cache is 
used to carry out our optimizations in order to avoid memory man¬ 
agement issues our Spark installation experienced while evicting 
from a near-full cache. 

Figure 3: Cache size estimates of candidate Sales views

Figure 4: Workload generation for

The tenant utility model we use to estimate the utility of a cache
configuration in our evaluation is based on the observations made
from real-life clusters in [9]. If all the datasets that a query needs
are cached, then the query is assigned a utility equal to the total
size of data it reads, which corresponds to the savings in disk read
I/O. Otherwise, we assign a utility of zero. It is observed in [?] that
queries do not benefit much from in-memory caching if any part of
their working set is not cached.
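
The all-or-nothing utility rule can be written down directly. In the sketch below the dataset names and sizes are illustrative, and summing query utilities into a per-tenant utility is an assumption of this example (the tenant-level model is defined in Section 2).

```python
def query_utility(query_datasets, cached, sizes_gb):
    """Total size read if every dataset the query needs is cached, else 0."""
    if all(name in cached for name in query_datasets):
        return sum(sizes_gb[name] for name in query_datasets)
    return 0.0


def tenant_utility(queries, cached, sizes_gb):
    """Assumed aggregation: a tenant's utility is the sum over its queries."""
    return sum(query_utility(q, cached, sizes_gb) for q in queries)


# Illustrative sizes in GB (hypothetical values for this example).
SIZES_GB = {"lineitem": 3.8, "orders": 1.0, "sales_view_7": 0.5}
queries = [["lineitem", "orders"], ["sales_view_7"]]

print(tenant_utility(queries, {"lineitem", "sales_view_7"}, SIZES_GB))
print(tenant_utility(queries, {"lineitem", "orders", "sales_view_7"}, SIZES_GB))
```

With only `lineitem` and `sales_view_7` cached, the join query contributes nothing because part of its working set is missing; caching `orders` as well turns its full read size into utility.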

The workflow is described in Section 2. We add here that the cache
update phase in our evaluation setup only marks datasets for caching
or uncaching using Spark's cache directives. Spark lazily updates the
cache when the first query requesting cached data from the batch is
scheduled for execution.

Workload Arrival and Data Access. 

Figure 4 shows our workload generation process. Several studies
have established that query arrival times follow a Poisson
distribution [?]. We use the same in our prototype. Previous studies
have also indicated that the data accessed by analytical workloads
follows a Zipf distribution [?]: A small number of datasets are
more popular than others, while there is a long tail of datasets that
are only sporadically accessed. To replicate such data access, our
synthetic Sales workload generator picks a dataset from a Zipfian
distribution provided at the time of configuration and adds grouping
and aggregation predicates from a probability distribution defined
for the chosen dataset. The TPC-H workload generator, on the
other hand, picks a benchmark query from a probability distribution
over the queries provided at the time of configuration.

Further, [53] also shows that 90% of recently accessed data is
re-accessed within the next hour of first access. This is plausible
because users typically want to drill down into a dataset further in
response to some interesting observation obtained in the previous
run. In order to support such scenarios, we pick a small window in
time from a Normal distribution. Over this window, a small subset
of datasets is chosen from the Zipfian g. This subset forms the
candidates for the duration of the window. Each query to be generated
picks one dataset from the candidates uniformly at random. This
technique is taken from [?], which terms the values used in the local
window “cold” values to differentiate them from globally popular
“hot” values. The generated workload still follows the Zipfian g
globally. The local distribution is optional; if not provided, datasets
are picked from Zipfian g at all times.
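
A minimal generator along these lines is sketched below. It is an illustration under stated assumptions, not the prototype's generator: arrivals are modeled as a Poisson process (exponential gaps with a given mean), dataset popularity follows a Zipf law with exponent 1, and the optional local window of candidate datasets is omitted.

```python
import random


def zipf_weights(num_datasets, exponent=1.0):
    """Unnormalized Zipf popularity: weight of rank r is 1 / r^exponent."""
    return [1.0 / (rank ** exponent) for rank in range(1, num_datasets + 1)]


def generate_workload(num_datasets, mean_gap_s, horizon_s, seed=0, exponent=1.0):
    """Return (arrival_time, dataset_index) pairs up to horizon_s seconds."""
    rng = random.Random(seed)
    weights = zipf_weights(num_datasets, exponent)
    clock, workload = 0.0, []
    while True:
        # Exponential inter-arrival gaps give Poisson arrivals.
        clock += rng.expovariate(1.0 / mean_gap_s)
        if clock > horizon_s:
            return workload
        dataset = rng.choices(range(num_datasets), weights=weights)[0]
        workload.append((clock, dataset))


workload = generate_workload(num_datasets=30, mean_gap_s=20.0,
                             horizon_s=100000.0, seed=1)
print(len(workload), workload[:3])
```

Dataset index 0 plays the role of the most popular dataset; over a long horizon it is requested roughly twice as often as index 1, with a long tail behind it.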

5.2 Performance Metrics 

We gather several performance metrics while executing a work¬ 
load. They are defined next. We emphasize that these metrics are 
over long time horizons. 


1. Throughput. This is simple to define:

$$\text{Throughput} = \frac{\text{number of queries served}}{\text{total time taken}} \qquad (4)$$

2. Fairness Index. For job schedulers, a performance-based
fairness index is defined in terms of variance in slowdowns
of jobs in a shared cluster compared to a baseline case where
every job receives all the resources [34]. As our work is
about speeding up queries, we use relative speedups across
queries while deriving fairness. The baseline is the case of
statically partitioned cache. Here, $X_i$ is the mean speedup for
tenant i, $\lambda_i$ is the weight of tenant i, and n is the number
of tenants.

$$\text{Fairness index} = \frac{\left( \sum_{i=1}^{n} X_i / \lambda_i \right)^2}{n \sum_{i=1}^{n} \left( X_i / \lambda_i \right)^2} \qquad (5)$$


3. Average Cache Utilization. This is simply the average frac¬ 
tion of cache utilized during workload execution. 

4. Hit Ratio. The fraction of queries served off cached views. 

Some of the other metrics we collect include flow time, mean 
execution time, mean wait time, and wait time fairness index. They 
are not included due to space constraints. 
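
The first two metrics are simple enough to state as code. The sketch below assumes that the fairness index is Jain's index applied to the weight-normalized mean speedups $X_i/\lambda_i$; function and argument names are illustrative.

```python
def throughput(num_queries_served, total_time_s):
    """Queries served per unit of time."""
    return num_queries_served / total_time_s


def fairness_index(mean_speedups, weights):
    """Jain's index over weight-normalized speedups X_i / lambda_i.

    Equals 1 when every tenant sees the same normalized speedup and
    approaches 1/n when a single tenant receives all the benefit.
    """
    scaled = [x / w for x, w in zip(mean_speedups, weights)]
    n = len(scaled)
    return sum(scaled) ** 2 / (n * sum(s * s for s in scaled))


print(throughput(120, 60.0))
print(fairness_index([2.0, 2.0], [1.0, 1.0]))
print(fairness_index([3.0, 1.0], [1.0, 1.0]))
```

Equal speedups give an index of 1, while the uneven pair (3, 1) gives 16/20 = 0.8, which shows how the index penalizes policies that favor one tenant.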

5.3 Algorithm Evaluation 

In this section, we evaluate four view selection algorithms on
various setups. Each algorithm processes a batch of query workload
in an offline manner as detailed in Section 2. Section 3 discussed
several possible algorithms. Here, we compare the following:

1. STATIC: Cache is partitioned in proportion to weights of the 
tenants. We treat this as baseline when evaluating fairness 
index. 


2. MMF: Max-min fairness implementation described in
Section 4.3.

3. FastPF: Proportional fairness implementation described in
Section 4.3.

4. OptP: The only goal is to optimize for query performance; 
Workload from a batch is treated as if belonging to a single 
tenant - a special case of either MMF or FastPF. 

In order to compare these algorithms across various settings, we 
vary the following parameters independently in our experiments. 


1. Data sharing among tenants (Section 5.3.1);

2. Workload arrival rate (Section 5.3.2); and

3. Number of tenants (Section 5.3.3).
Table 8: Data access distributions used in evaluation on a mixed
workload

Setup   Distributions used by four tenants
1       {$g_1$, $g_1$, $g_1$, $g_1$}
2       {$g_1$, $g_1$, $g_1$, $g_2$}
3       {$g_1$, $g_1$, $g_2$, $g_3$}
4       {$g_1$, $g_2$, $g_3$, $g_4$}

Table 9: Data access distributions used in evaluation on Sales
workload


5.3.1 Effect of data sharing among tenants 
To study the impact of different data sharing patterns on the
performance of algorithms, we create four different workload
distributions: $h_1$ picks queries uniformly at random over a set of 15
TPC-H benchmark queries; $g_1$ to $g_3$ are three different Zipf
distributions over 30 Sales datasets over which scan-and-aggregation
queries are generated. Each of the distributions is skewed towards a
different subset of datasets. Using these distributions, we create
four test setups allowing different levels of data sharing, as listed
in Table 8. The batch size is set to 40 seconds; the inter-query
arrival time distribution for all the tenants is given by Poisson(20);
and we run 30 batches of workload for every data point.

Figure 5 shows how different algorithms perform in each of these
setups. Throughput goes down with heterogeneity in data access. 
STATIC policy fails to cache any dataset for TPC-H workload be¬ 
cause each of the queries we generate reads the largest table, lineitem, 
which amounts to ~ 3.8GB, much larger than cache at the dis¬ 
posal of STATIC. The other three policies, on the other hand, can 
serve every query off cache in setup ffi . As a result, they exhibit 
a throughput of more than 2x over STATIC. However, as the het¬ 
erogeneity in data access increases, the gap in throughput narrows. 
Even though the shared policies cache more data, frequent updates 
to cache configuration per batch cause additional delays. We
explore the possibility of retaining the state of the cache in Section 5.4.

Among the shared cache policies, OptP scores high on throughput
but very low on the fairness index. It uses the cache exclusively for
TPC-H tenants at the cost of degrading the Sales tenants'
performance. The MMF and FastPF policies, on the other hand, show
much better tradeoffs in terms of performance and fairness to tenants.

We also repeated the same experiment on Sales data alone. We
first create four different Zipf distributions over candidate views:
$g_1$, $g_2$, $g_3$, $g_4$. Each of the distributions is skewed towards
a different subset of views. We create four test setups, each allowing
a different level of data sharing, as listed in Table 9. The other
common parameters are listed in Table 10.

Figure 6 shows how different algorithms perform in each of these
setups. Throughput goes down with heterogeneity in data access.
STATIC performs poorly in all the setups, its performance being
30%-40% worse than that of the others. Its lower cache utilization
and lower hit ratio are further indicators of why STATIC is not
the right choice for cache allocation. There is very little to
distinguish among the three cache-sharing algorithms. This shows that
our fair algorithms can provide a throughput close to the optimal.
In terms of fairness, the OptP algorithm gives the most inconsistent
performance. It scores high in the setup with most heterogeneity,
but fails when data sharing is involved. MMF and FastPF, on the
other hand, score high in all the setups.

[Figure 7: Fraction of time the popular views in setup were cached.
Panels: top 3 views from $g_1$, top 3 views from $g_2$. Series: MMF,
FastPF, OptP.]

The performance of MMF falls notably low in the second setup.
This is an outcome of the data sharing pattern wherein three of four
tenants largely share the same subset of views. Recalling the example
presented in Table 4, MMF tries to share the cache
(probabilistically) equally between the two sets of tenants,
effectively producing an allocation outside the core. Figure 7 shows
the fraction of time the most popular views were cached by MMF,
FastPF, and OptP. The top three views in each of $g_1$ and $g_2$ serve
25%, 13%, and 8% of the queries respectively. It can be seen that
while MMF caches the topmost view from the two distributions roughly
equally, FastPF and OptP favor the topmost view from $g_1$ more since
it is shared by three tenants. MMF tries to compensate the three
tenants by caching their second best view more, but this view has a
lower utility both due to lower access frequency and smaller size. So
the overall performance of MMF suffers in this case.

5.3.2 Effect of variance in query arrival rates 

To replicate the bursty tenants scenarios, we vary query
inter-arrival rates of tenants in a two-tenant setup. We create three
setups (low, mid, and high) with query inter-arrival rates as listed in
Table 11. The other parameters used in each of the setups are listed
in Table 12.


Setup   Poisson mean, $\lambda_1$   Poisson mean, $\lambda_2$
low     12                          12
mid     18                          8
high    24                          6

Table 11: Query inter-arrival rates for different setups


Figure [8] shows the impact of variance in query arrival rate on 
various metrics. The performance of STATIC remains below the 
other three algorithms as can be seen from the first three graphs. 


Parameter                          Value
Query inter-arrival rates (sec)    {20 ∀ tenant}
Batch size (sec)                   40
Number of batches                  30

Table 10: Data sharing experiment setup


Parameter                    Value
Data access distributions    {g1, g1}
Batch size (sec)             12
Number of batches            30

Table 12: Query arrival rate experiment setup

Figure 5: Effect of data sharing changes on four equi-paced tenants on a mixed workload

Figure 9: Mean speedups provided by different algorithms over
STATIC policy for the two tenants in the setup high


The performance gap, however, is small because the cache is
partitioned into only two parts for STATIC, each part being large
enough to serve 80% of the queries that could be served off an
unpartitioned cache. When it comes to the fairness index, all the
algorithms except OptP get a near-perfect score. OptP favors the
faster tenant in both the mid and high setups so much that the
slower tenant's performance degrades. Figure 9 shows the speedups
for MMF, FastPF, and OptP relative to STATIC under the setup high.
The first tenant sees a performance degradation with OptP,
empirically confirming that OptP does not satisfy sharing incentive.
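
The check behind this observation can be sketched as follows. The sketch assumes that sharing incentive is tested empirically by comparing each tenant's mean speedup against the STATIC baseline, so that a speedup below 1 signals a violation. The speedup values are hypothetical placeholders, not the measurements plotted in Figure 9, and the function name is illustrative.

```python
def sharing_incentive_violations(speedups_over_static):
    """Return the indices of tenants that do worse than under STATIC.

    speedups_over_static[i] is the mean speedup of tenant i relative to
    the statically partitioned cache; values below 1.0 mean the tenant
    would have preferred its own static partition.
    """
    return [i for i, s in enumerate(speedups_over_static) if s < 1.0]


# Hypothetical per-tenant speedups for a two-tenant setup.
observed = {
    "MMF": [1.10, 1.12],
    "FastPF": [1.08, 1.18],
    "OptP": [0.92, 1.35],
}

for algorithm, speedups in observed.items():
    losers = sharing_incentive_violations(speedups)
    status = "ok" if not losers else f"violated for tenants {losers}"
    print(f"{algorithm}: {status}")
```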

5.3.3 Effect of number of tenants 


Setup (number of tenants)    Poisson mean, λ
2                            10
4                            20
8                            40

Table 13: Query inter-arrival rates for a tenant under different
setups

To further stress the utility of optimizing the entire cache as a
shared resource, we experiment with an increasing number of tenants.
Specifically, we consider scenarios with 2, 4, and 8 tenants, all
using the same distribution over dataset access. We try to keep
the number of queries per batch the same by doubling the query
inter-arrival rate whenever the number of tenants doubles, with the
batch size remaining the same across the setups. Table 13 lists the
query inter-arrival rates we used. The other parameters common
across the setups are listed in Table 14.
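
The arithmetic behind this design can be checked with a short sketch. It uses the values in Table 13 and the 40-second batch size from Table 14, and assumes that the expected number of queries a tenant issues per batch is the batch size divided by its mean inter-arrival time; the names are illustrative.

```python
BATCH_SEC = 40  # batch size, from Table 14

# Number of tenants -> mean inter-arrival time per tenant (sec), from Table 13.
SETUPS = {2: 10, 4: 20, 8: 40}


def queries_per_tenant(mean_gap_sec, batch_sec=BATCH_SEC):
    """Expected queries one tenant issues in one batch."""
    return batch_sec / mean_gap_sec


def queries_per_batch(num_tenants, mean_gap_sec, batch_sec=BATCH_SEC):
    """Expected queries from all tenants in one batch."""
    return num_tenants * queries_per_tenant(mean_gap_sec, batch_sec)


for tenants, gap in SETUPS.items():
    print(tenants, queries_per_tenant(gap), queries_per_batch(tenants, gap))
```

The total stays at eight queries per batch in all three setups, while the per-tenant share drops from four to two to one.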

Figure 10 shows the behavior of the algorithms under these
scenarios. The gap in throughput between STATIC and the other
algorithms is large (35%-45%). As the number of tenants goes up, the
average cache utilization of STATIC drops sharply, whereas the
average cache utilization of the other algorithms remains largely
stable. This can be attributed to the static partitioning of cache in
STATIC. The hit ratio shows a similar pattern, again showing why
STATIC is not the best choice. In terms of the fairness index, OptP
finds it increasingly hard to provide a fair solution. With an
increase in the number of tenants, the number of queries per tenant
per batch goes down, which makes the locally optimal choices of
OptP less likely to provide equal speedups. In contrast, MMF
and FastPF, with their randomized choices, score over 0.9 in all
the scenarios.

5.4 Discussion and Future Work 


Parameter                    Value
Data access distributions    {g1 ∀ tenant}
Batch size (sec)             40
Number of batches            30

Table 14: Number of tenants experiment setup

Figure 11: Fairness index as a function of number of batches


Our evaluation on practical setups brings up some interesting
insights that open up multiple possibilities for the future. We
discuss some of the challenges and directions here.

Our experiments show that across all setups, FastPF and MMF
provide far better trade-offs between throughput and fairness than
STATIC and OptP. We note that in comparing the max-min fair and
proportional fair implementations, there is no clear winner. We
believe this is a second-order difference that a more precise cost
model and implementations of the exact algorithms (for instance,
the algorithm in Section 4.1 for proportional fairness) will bring
out. However, even given similar empirical results, PF has the
advantage of the core property as a succinct and easy-to-explain
notion of fairness.

We next note that the running time of our algorithms is polynomial
in the number of tenants. In most typical industry setups, such as
the ones we evaluated, there are only a handful of tenants. Therefore,
we expect our algorithms to be fast even in the wild. To quantify
the query wait times, we observed them to be of the order of
tens of milliseconds in most cases.

Convergence Properties. 

As our algorithms are randomized in nature, it is important to
study how long they take to converge to solutions that yield
fairness across time. After running several workloads, we find that
the number of batches needed to achieve convergence is very small,
of the order of 15-25. In Figure 11, we present results of a
four-tenant workload with 50 batches, optimized once using MMF and
once using FastPF. The fairness index was computed after every 2-4
batches. Both algorithms converge to their respective optimal
values at around 20 batches. As future work, we plan to
systematically study which parameters determine the rate of
convergence of the algorithms.
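
One way to track this convergence is sketched below. The sketch assumes that the fairness index is Jain's index computed over the cumulative per-tenant utilities (for example, speedups) observed so far; this section does not restate the definition, so treat that choice as an assumption. The batch data are hypothetical and only illustrate a randomized policy that alternates between two tenants.

```python
def jain_index(values):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in (0, 1]."""
    n = len(values)
    total = sum(values)
    squares = sum(v * v for v in values)
    return (total * total) / (n * squares) if squares > 0 else 1.0


def running_fairness(per_batch_utils):
    """Fairness index of cumulative utilities after each batch.

    per_batch_utils[b][i] is the utility tenant i received in batch b.
    """
    cumulative = [0.0] * len(per_batch_utils[0])
    history = []
    for batch in per_batch_utils:
        cumulative = [c + u for c, u in zip(cumulative, batch)]
        history.append(jain_index(cumulative))
    return history


# Hypothetical two-tenant run in which a randomized policy happens to
# serve the tenants in alternating batches.
batches = [[1, 0], [0, 1], [1, 0], [0, 1]]
history = running_fairness(batches)
print(history)
```

A single batch can look unfair on its own, but the index computed over accumulated utilities approaches 1 as more batches are processed, which is the sense in which a randomized allocation is fair across time.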

Batch Size and Cache State. 

Our batched processing architecture introduces additional
optimization choices. Primarily, there are two ways of tuning a view
selection algorithm:

1. Controlling batch size, and 

2. Managing state of cache across batches. 

The first option needs no elaboration. The second option is whether
the cache is treated as stateful or as stateless when optimizing a
batch. In the stateful case, the estimated benefit of views that are
already in the cache is boosted by a factor γ > 1. This influences
the next cache allocation and makes it more likely for these views
to stay in the cache. The stateless case ignores the state of the
cache when considering the next batch. All the results presented so
far have used a stateless cache.
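
The stateful adjustment can be sketched in a few lines. This is an illustration under stated assumptions, not the ROBUS code: benefits are kept in a plain dictionary keyed by view name, the view names and benefit values are hypothetical, and the boost is applied before the view selection algorithm runs on the next batch.

```python
def adjusted_benefits(estimated_benefit, cached_views, gamma=2.0, stateful=True):
    """Return the benefit estimates used to optimize the next batch.

    estimated_benefit maps view name -> estimated benefit for the batch.
    In the stateful mode, views already in the cache get their benefit
    multiplied by gamma (> 1); in the stateless mode nothing changes.
    """
    if gamma <= 1.0:
        raise ValueError("gamma must be greater than 1")
    if not stateful:
        return dict(estimated_benefit)
    return {
        view: benefit * gamma if view in cached_views else benefit
        for view, benefit in estimated_benefit.items()
    }


# Hypothetical estimates for three candidate views; v1 is currently cached.
estimates = {"v1": 4.0, "v2": 5.0, "v3": 1.0}
print(adjusted_benefits(estimates, cached_views={"v1"}))
print(adjusted_benefits(estimates, cached_views={"v1"}, stateful=False))
```

With the boost, the cached view v1 outranks v2 even though its raw estimate is lower, so it is more likely to survive into the next batch.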

We empirically compared how the algorithms react to these
parameters. Figure 12 shows the effect of changing the batch size on
two versions each of MMF and FastPF: one treating the cache as
stateless (MMFsl and FastPFsl), and the other treating it as stateful
(MMFsf and FastPFsf), with γ = 2. Both versions provide similar
throughput in all the cases. The stateful algorithms score higher on
fairness for the smallest batch size, but there is no clear pattern
when the batch size is larger. This makes sense: smaller batches do
not offer enough choices for fair cache configurations, and
maintaining state artificially increases the effective batch size,
thereby providing better configurations. As future work, we plan to
explore these trade-offs on a larger scale to devise better
guidelines on parameter tuning.

[Figure 10 panels: Throughput (/min), Average cache utilization, Hit ratio, Fairness index; x-axis: number of tenants (2, 4, 8); series: Static, MMF, FastPF, OptP]
Figure 10: Effect of changing number of tenants

[Figure 12 panels: Throughput (/min), Fairness index; series: MMFsl, MMFsf, FastPFsl, FastPFsf]
Figure 12: Effect of batch size on four equi-paced tenants setup

Engineering issues. 

We now highlight some challenges in scaling up our experiments 
to industry scale. These challenges are tied to engineering issues 
in current implementations of systems such as Spark, and will get 
ironed out over time. Most common multi-tenant Spark setups use 
a separate Spark context for each tenant, effectively partitioning 
cache. In fact, most current multi-tenant data warehouse systems 
recommend splitting memory across queues. This is in part due 
to multi-thread management challenges that result in unpredictable 
behavior such as premature eviction of cached data blocks. Another 
engineering issue, specific to Spark, is the inordinately long delay
in garbage collection when the cluster scales up. We should be able
to see a much better impact of ROBUS optimization once these 
practical issues get resolved. 

Code Base. 

The code base of ROBUS has been open-sourced [3], and our
entire experimental setup can be replicated following a simple set 
of instructions provided with the code. 

6. RELATED WORK 


Physical design tuning and Multi-query optimization. 

Classical view materialization algorithms in databases [32]
treat the entire workload as a set and optimize towards one
or more of flow time, space budget, and view maintenance
costs. Online physical design tuning approaches,
on the other hand, adapt to changes in the query workload by
modifying the physical design. None of the aforementioned approaches
support multi-tenant workloads, and therefore they cannot be used in
selecting views for caching. However, some of the techniques used, in
particular candidate view enumeration, view matching, and query
rewrite, can be applied in the ROBUS framework.

Batched optimization of queries was proposed in [56] and is used
in many work sharing approaches [59], [51]. ROBUS likewise employs
batched query optimization, but crucially also ensures that
each tenant gets their fair share of benefit.

Fairness theory. 

The proportional fairness algorithm is widely studied in
Economics [35] as well as in scheduling theory [33]. In the context
of resource partitioning problems (or exchange economies), it is
well-known that a convex program, called the Eisenberg-Gale convex
program [36], computes prices that implement a Walrasian equilibrium
(or market clearing solution). Our shared resource allocation
problem is different from allocation problems where resources need
to be partitioned, and it is not clear how to specify prices for
resources (or views) in our setting. Nevertheless, we show that
there is an exponential size convex program using configurations as
variables, whose solution implements proportional fairness in a
randomized sense.
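
One way to write such a program is sketched below. The notation is an assumption made for illustration and may differ from the paper's own: $n$ is the number of tenants, $\mathcal{C}$ is the (exponentially large) set of feasible cache configurations, $U_i(S)$ is the utility of tenant $i$ when configuration $S$ is cached, and $x_S$ is the probability of choosing configuration $S$ in a batch.

```latex
\begin{align*}
\text{maximize}\quad   & \sum_{i=1}^{n} \log\Bigl(\sum_{S \in \mathcal{C}} x_S \, U_i(S)\Bigr) \\
\text{subject to}\quad & \sum_{S \in \mathcal{C}} x_S = 1, \\
                       & x_S \ge 0 \quad \text{for all } S \in \mathcal{C}.
\end{align*}
```

The objective is concave in $x$ and the constraints are linear, so this is a convex program; it has one variable per configuration, which is why its size is exponential.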

In scheduling theory, the focus is on analyzing delay
properties [41], [40], [33], assuming jobs have durations. Our focus
is instead on utility maximization, which has also been considered
in the context of wireless scheduling in [57], [10]. The latter work
focuses on long-term fairness for partitioned resources, where the
utility of a tenant is defined as the sum of discounted utilities
across time. The resulting algorithms, though simple, only provide
guarantees assuming job arrivals are ergodic and tenants exist
forever. They do not provide per-epoch guarantees. In contrast, we
focus on obtaining per-epoch fairness in a randomized sense without
ergodic assumptions, and on defining the right fairness concepts
when resources are shared. We finally note that [39] presents
dynamic schemes for achieving envy-freeness across time; however,
these techniques are specific to resource partitioning problems and
do not directly apply to our shared resource setting.

Multi-tenant architectures. 

Traditionally, the notion of multi-tenancy in databases deals with
sharing database system resources, viz. hardware, process, schema,
among users. Each tenant only accesses data owned by
them. Emerging multi-tenant big data architectures, on the other
hand, allow for entire cluster data to be shared among tenants. This
sharing of data is critical in our work as it allows the cache to be
used much more efficiently.

A critical component of modern multi-tenant architectures, such
as Apache Hadoop, Apache Spark, and Cloudera Impala, is a fair
scheduler/resource allocator [27], [34]. The resource pool considered
by these schedulers does not differentiate the cache resource from
the heap resource and, as a result, divides the cache among
tenants. As seen in our work, partitioned cache setups severely
reduce optimization opportunities. Some recent approaches treat cache
as an independent resource when running multiple jobs. PACMan
exploits the multi-wave execution workflow of Hadoop jobs
to make caching decisions at the granularity of the parallel
tasks of a job. In another work, LRU policies for buffer pool
memory are extended to meet SLA guarantees of multiple
tenants [49]. However, none of these approaches exploit the
opportunities presented by the multi-shared nature of the cache
resource. In another advancement, distributed analytics systems are
supporting a distributed cache store shared by multiple tenants.
The ROBUS optimizer will be a natural fit for such systems.

7. CONCLUSION 

Emerging big data multi-tenant analytics systems complement
abundant disk-based storage with a smaller, but much faster,
cache in order to optimize workloads by materializing views in the
cache. The cache is a shared resource, i.e., cached data can be
accessed by all tenants. In this paper, we presented ROBUS, a
cache management platform for achieving both a fair allocation of
cache and near-optimal performance in such architectures. We
defined notions of fairness for the shared setting using
randomization in small batches as a key tool. We presented a fairness
model that incorporates Pareto-efficiency and sharing incentive, and
also achieves envy-freeness via the notion of core from cooperative
game theory. We showed that a proportionally fair mechanism satisfies
the core property in expectation. Further, we developed efficient
algorithms for two fair mechanisms and implemented them in a ROBUS
prototype built on a Spark cluster. Our experiments on various
practical setups show that it is possible to achieve near-optimal
fairness while simultaneously preserving near-optimal performance
speedups using the algorithms we developed.

Our framework is quite general and applies to any setting where
resource allocations are shared across agents. As future work, we
plan to explore other applications of this framework.

APPENDIX

A. EXPERIMENT RESULTS 

A.1 Results of experiments on effect of data
sharing on mixed workload


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    7.80      19.2    19.2      19.2
Avg cache util.      0.00      0.83    0.83      0.83
Hit ratio            0.00      1.00    1.00      1.00
Fairness index       1.00      0.71    0.71      0.71

Table 15: Performance of algorithms on setup G1


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    7.20      9.00    10.2      16.2
Avg cache util.      0.08      0.81    0.87      0.92
Hit ratio            0.08      0.54    0.68      0.83
Fairness index       1.00      0.83    0.79      0.75

Table 16: Performance of algorithms on setup G2


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    7.20      7.50    7.80      9.60
Avg cache util.      0.16      0.96    0.98      1.00
Hit ratio            0.19      0.53    0.55      0.67
Fairness index       1.00      0.77    0.66      0.50

Table 17: Performance of algorithms on setup G3


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    5.40      5.40    5.40      4.80
Avg cache util.      0.24      0.91    0.93      0.96
Hit ratio            0.26      0.43    0.47      0.46
Fairness index       1.00      0.81    0.80      0.38

Table 18: Performance of algorithms on setup G4


A.2 Results of experiments on effect of data 
sharing on Sales workload 


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    6.00      9.42    9.42      10.08
Avg cache util.      0.34      0.87    0.86      0.88
Hit ratio            0.42      0.67    0.67      0.68
Fairness index       1.00      0.98    0.94      0.84

Table 19: Performance of algorithms on setup G1


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    5.70      7.20    7.44      8.24
Avg cache util.      0.34      0.93    0.90      0.94
Hit ratio            0.43      0.57    0.61      0.63
Fairness index       1.00      0.96    0.92      0.78

Table 20: Performance of algorithms on setup G2









Metric               Static    MMF     FastPF    OptP
Throughput (/min)    5.34      7.44    7.38      7.92
Avg cache util.      0.30      0.93    0.93      0.94
Hit ratio            0.38      0.60    0.59      0.58
Fairness index       1.00      0.98    0.92      0.72

Table 21: Performance of algorithms on setup G3


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    4.20      5.64    5.76      6.00
Avg cache util.      0.28      0.89    0.88      0.92
Hit ratio            0.34      0.50    0.56      0.55
Fairness index       1.00      0.96    0.96      0.99

Table 22: Performance of algorithms on setup G4


A.3 Results of experiments on effect of query 
arrival rate 


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    5.76      6.42    6.72      6.90
Avg cache util.      0.77      0.93    0.93      0.94
Hit ratio            0.40      0.50    0.49      0.51
Fairness index       1.00      1.00    0.99      0.97

Table 23: Performance of algorithms on setup low


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    6.12      6.78    6.96      6.96
Avg cache util.      0.72      0.90    0.89      0.90
Hit ratio            0.44      0.49    0.49      0.56
Fairness index       1.00      1.00    0.98      0.87

Table 24: Performance of algorithms on setup mid


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    5.52      6.12    6.30      6.54
Avg cache util.      0.69      0.90    0.91      0.91
Hit ratio            0.39      0.48    0.48      0.51
Fairness index       1.00      1.00    1.00      0.89

Table 25: Performance of algorithms on setup high


A.4 Results of experiments on effect of number
of tenants


Metric               Static    MMF      FastPF    OptP
Throughput (/min)    7.00      10.00    9.70      10.40
Avg cache util.      0.67      0.93     0.93      0.97
Hit ratio            0.50      0.68     0.68      0.68
Fairness index       1.00      0.98     1.00      1.00

Table 26: Performance of algorithms on setup with 2 tenants


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    6.00      9.40    9.40      10.10
Avg cache util.      0.34      0.87    0.86      0.88
Hit ratio            0.42      0.67    0.67      0.68
Fairness index       1.00      0.98    0.94      0.84

Table 27: Performance of algorithms on setup with 4 tenants


Metric               Static    MMF     FastPF    OptP
Throughput (/min)    5.34      8.34    8.22      9.18
Avg cache util.      0.07      0.82    0.82      0.87
Hit ratio            0.26      0.65    0.65      0.68
Fairness index       1.00      0.94    0.91      0.78

Table 28: Performance of algorithms on setup with 8 tenants

























































